Overview

Dataset statistics

Number of variables: 42
Number of observations: 199523
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 3229
Duplicate rows (%): 1.6%
Total size in memory: 413.3 MiB
Average record size in memory: 2.1 KiB
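
The two derived figures in this block (duplicate share and average record size) follow from the raw counts by simple arithmetic. The sketch below re-derives them; the helper names `percent` and `average_record_kib` are illustrative assumptions and are not part of pandas-profiling.

```python
def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100.0 * part / whole, 1)


def average_record_kib(total_mib: float, n_rows: int) -> float:
    """Average size of one row in KiB, given the total size in MiB."""
    return round(total_mib * 1024 / n_rows, 1)


# Figures copied from the dataset statistics above.
n_observations = 199523
duplicate_pct = percent(3229, n_observations)
avg_record = average_record_kib(413.3, n_observations)

print(f"Duplicate rows (%): {duplicate_pct}%")
print(f"Average record size in memory: {avg_record} KiB")
```

Both values agree with the report (1.6% and 2.1 KiB) after rounding to one decimal.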

Variable types

CAT: 32
NUM: 10

Warnings

Dataset has 3229 (1.6%) duplicate rows [Duplicates]
state_of_previous_residence has a high cardinality: 51 distinct values [High cardinality]
state_of_previous_residence is highly correlated with region_of_previous_residence [High correlation]
region_of_previous_residence is highly correlated with state_of_previous_residence [High correlation]
detailed_household_summary_in_household is highly correlated with detailed_household_and_family_stat [High correlation]
detailed_household_and_family_stat is highly correlated with detailed_household_summary_in_household [High correlation]
live_in_this_house_1_year_ago is highly correlated with migration_code-change_in_msa and 4 other fields [High correlation]
migration_code-change_in_msa is highly correlated with live_in_this_house_1_year_ago and 1 other fields [High correlation]
migration_code-change_in_reg is highly correlated with live_in_this_house_1_year_ago and 1 other fields [High correlation]
migration_code-move_within_reg is highly correlated with live_in_this_house_1_year_ago and 1 other fields [High correlation]
migration_prev_res_in_sunbelt is highly correlated with live_in_this_house_1_year_ago and 1 other fields [High correlation]
year is highly correlated with migration_code-change_in_msa and 4 other fields [High correlation]
dividends_from_stocks is highly skewed (γ1 = 27.78650179) [Skewed]
age has 2839 (1.4%) zeros [Zeros]
detailed_industry_recode has 100684 (50.5%) zeros [Zeros]
detailed_occupation_recode has 100684 (50.5%) zeros [Zeros]
wage_per_hour has 188219 (94.3%) zeros [Zeros]
capital_gains has 192144 (96.3%) zeros [Zeros]
capital_losses has 195617 (98.0%) zeros [Zeros]
dividends_from_stocks has 178382 (89.4%) zeros [Zeros]
num_persons_worked_for_employer has 95983 (48.1%) zeros [Zeros]
weeks_worked_in_year has 95983 (48.1%) zeros [Zeros]
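
The Zeros warnings report a count and a share of zero-valued entries per column. A minimal sketch of that computation in plain Python follows; `zero_share` and the sample list are hypothetical, and the snippet does not reproduce whatever threshold pandas-profiling applies before raising the warning.

```python
from typing import Iterable, Tuple


def zero_share(values: Iterable[float]) -> Tuple[int, float]:
    """Return (number of zeros, zeros as a percentage rounded to one decimal)."""
    items = list(values)
    if not items:
        raise ValueError("cannot compute a zero share for an empty column")
    zeros = sum(1 for v in items if v == 0)
    return zeros, round(100.0 * zeros / len(items), 1)


# Hypothetical toy column: mostly zeros, like wage_per_hour.
sample = [0, 0, 0, 12.5, 0, 8.0, 0, 0]
zeros, pct = zero_share(sample)
print(f"sample has {zeros} ({pct}%) zeros")

# Re-deriving the reported share for wage_per_hour from its counts.
reported_pct = round(100.0 * 188219 / 199523, 1)
print(f"wage_per_hour: {reported_pct}% zeros")
```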

Reproduction

Analysis started: 2020-11-13 10:21:46.997330
Analysis finished: 2020-11-13 10:23:59.947266
Duration: 2 minutes and 12.95 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
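
The reported duration can be checked against the two timestamps. The sketch below uses only the standard library; the format string is an assumption chosen to match how the timestamps are printed above.

```python
from datetime import datetime

# Assumed format, matching the timestamps as printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2020-11-13 10:21:46.997330", FMT)
finished = datetime.strptime("2020-11-13 10:23:59.947266", FMT)

# Split the elapsed time into whole minutes and remaining seconds.
elapsed_seconds = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed_seconds, 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```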

Variables

age
Real number (ℝ≥0)

ZEROS

Distinct: 91
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.49419866
Minimum: 0
Maximum: 90
Zeros: 2839
Zeros (%): 1.4%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 15
Median: 33
Q3: 50
95-th percentile: 75
Maximum: 90
Range: 90
Interquartile range (IQR): 35

Descriptive statistics

Standard deviation: 22.31089521
Coefficient of variation (CV): 0.6468013774
Kurtosis: -0.7328243009
Mean: 34.49419866
Median Absolute Deviation (MAD): 17
Skewness: 0.3732904573
Sum: 6882386
Variance: 497.7760449
Monotonicity: Not monotonic
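
Several of the statistics listed for age are functions of the others: the range and IQR come from the quantiles, the CV is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below checks those relations against the reported figures; variable names are illustrative, and the tolerance is an assumption that allows for the rounding of the printed values.

```python
import math

# Values copied from the age statistics above.
n_observations = 199523
total = 6882386
mean = 34.49419866
std = 22.31089521
minimum, q1, q3, maximum = 0, 15, 50, 90

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance

# Compare with the reported figures, tolerating print rounding.
checks = {
    "range": value_range == 90,
    "iqr": iqr == 35,
    "mean": math.isclose(total / n_observations, mean, rel_tol=1e-6),
    "cv": math.isclose(cv, 0.6468013774, rel_tol=1e-6),
    "variance": math.isclose(variance, 497.7760449, rel_tol=1e-6),
}
print(checks)
```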
Histogram with fixed size bins (bins=50)
Value              Count    Frequency (%)
34                 3489     1.7%
35                 3450     1.7%
36                 3353     1.7%
31                 3351     1.7%
33                 3340     1.7%
5                  3332     1.7%
4                  3318     1.7%
3                  3279     1.6%
37                 3278     1.6%
38                 3277     1.6%
2                  3236     1.6%
7                  3218     1.6%
30                 3203     1.6%
32                 3188     1.6%
8                  3187     1.6%
6                  3171     1.6%
9                  3162     1.6%
13                 3152     1.6%
39                 3144     1.6%
1                  3138     1.6%
41                 3134     1.6%
10                 3134     1.6%
11                 3128     1.6%
40                 3114     1.6%
14                 3068     1.5%
Other values (66)  118679   59.5%

Value              Count    Frequency (%)
0                  2839     1.4%
1                  3138     1.6%
2                  3236     1.6%
3                  3279     1.6%
4                  3318     1.7%
5                  3332     1.7%
6                  3171     1.6%
7                  3218     1.6%
8                  3187     1.6%
9                  3162     1.6%

Value              Count    Frequency (%)
90                 725      0.4%
89                 195      0.1%
88                 241      0.1%
87                 301      0.2%
86                 348      0.2%
85                 423      0.2%
84                 519      0.3%
83                 561      0.3%
82                 615      0.3%
81                 720      0.4%
 

class_of_worker
Categorical

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value                            Count    Frequency (%)
Not in universe                  100245   50.2%
Private                          72028    36.1%
Self-employed-not incorporated   8445     4.2%
Local government                 7784     3.9%
State government                 4227     2.1%
Self-employed-incorporated       3265     1.6%
Federal government               2925     1.5%
Never worked                     439      0.2%
Without pay                      165      0.1%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 30
Median length: 15
Mean length: 13.02115546
Min length: 7
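
The length statistics can be recovered from the class_of_worker value counts alone, because every row with the same label has the same string length. The sketch below is one way to do it in plain Python; `length_stats` is a hypothetical helper, and its weighted-median rule is a simplifying assumption (for an even total it picks the upper of the two middle values).

```python
# Value counts copied from the class_of_worker frequency table.
VALUE_COUNTS = {
    "Not in universe": 100245,
    "Private": 72028,
    "Self-employed-not incorporated": 8445,
    "Local government": 7784,
    "State government": 4227,
    "Self-employed-incorporated": 3265,
    "Federal government": 2925,
    "Never worked": 439,
    "Without pay": 165,
}


def length_stats(counts):
    """Min, median, mean and max label length, weighted by row count."""
    total = sum(counts.values())
    by_length = sorted((len(label), n) for label, n in counts.items())
    mean = sum(length * n for length, n in by_length) / total

    # Walk the sorted lengths until the middle row is reached.
    middle = (total + 1) / 2
    running = 0
    median = by_length[-1][0]
    for length, n in by_length:
        running += n
        if running >= middle:
            median = length
            break

    return {
        "min": by_length[0][0],
        "median": median,
        "mean": mean,
        "max": by_length[-1][0],
    }


stats = length_stats(VALUE_COUNTS)
print(stats)
```

With these counts the result agrees with the figures above: minimum 7, median 15, maximum 30, and a mean of about 13.0212.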

Overview of Unicode Properties

Unique unicode characters: 29
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
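
As a rough illustration of how such character properties can be tallied, the sketch below uses the standard-library `unicodedata.category` function on the class_of_worker labels, weighting each character by the number of rows that carry the label. The function name `category_totals` is an assumption, and this is not necessarily how pandas-profiling computes its tables.

```python
import unicodedata
from collections import Counter

# Value counts copied from the class_of_worker frequency table.
VALUE_COUNTS = {
    "Not in universe": 100245,
    "Private": 72028,
    "Self-employed-not incorporated": 8445,
    "Local government": 7784,
    "State government": 4227,
    "Self-employed-incorporated": 3265,
    "Federal government": 2925,
    "Never worked": 439,
    "Without pay": 165,
}


def category_totals(counts):
    """Count characters per Unicode general category across all rows."""
    totals = Counter()
    for label, n in counts.items():
        for ch in label:
            # e.g. 'Ll' lowercase letter, 'Lu' uppercase letter,
            # 'Zs' space separator, 'Pd' dash punctuation
            totals[unicodedata.category(ch)] += n
    return totals


totals = category_totals(VALUE_COUNTS)
for code, count in totals.most_common():
    print(code, count)
```

For this variable the four categories found are Ll, Zs, Lu and Pd, and their totals sum to the 2598020 characters reported for the ASCII block.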

Most occurring characters

Value              Count    Frequency (%)
e                  360624   13.9%
i                  284393   10.9%
n                  250517   9.6%
(space)            224475   8.6%
t                  216148   8.3%
r                  214432   8.3%
v                  187648   7.2%
o                  167144   6.4%
N                  100684   3.9%
u                  100410   3.9%
s                  100245   3.9%
a                  98839    3.8%
P                  72028    2.8%
l                  34129    1.3%
d                  26784    1.0%
m                  26646    1.0%
p                  23585    0.9%
-                  23420    0.9%
c                  19494    0.8%
S                  15937    0.6%
g                  14936    0.6%
y                  11875    0.5%
f                  11710    0.5%
L                  7784     0.3%
F                  2925     0.1%
Other values (4)   1208     < 0.1%
 

Most occurring categories

Value              Count     Frequency (%)
Lowercase Letter   2150602   82.8%
Space Separator    224475    8.6%
Uppercase Letter   199523    7.7%
Dash Punctuation   23420     0.9%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
N10068450.5%
 
P7202836.1%
 
S159378.0%
 
L77843.9%
 
F29251.5%
 
W1650.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e36062416.8%
 
i28439313.2%
 
n25051711.6%
 
t21614810.1%
 
r21443210.0%
 
v1876488.7%
 
o1671447.8%
 
u1004104.7%
 
s1002454.7%
 
a988394.6%
 
l341291.6%
 
d267841.2%
 
m266461.2%
 
p235851.1%
 
c194940.9%
 
g149360.7%
 
y118750.6%
 
f117100.5%
 
w439< 0.1%
 
k439< 0.1%
 
h165< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
224475100.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-23420100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin235012590.5%
 
Common2478959.5%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e36062415.3%
 
i28439312.1%
 
n25051710.7%
 
t2161489.2%
 
r2144329.1%
 
v1876488.0%
 
o1671447.1%
 
N1006844.3%
 
u1004104.3%
 
s1002454.3%
 
a988394.2%
 
P720283.1%
 
l341291.5%
 
d267841.1%
 
m266461.1%
 
p235851.0%
 
c194940.8%
 
S159370.7%
 
g149360.6%
 
y118750.5%
 
f117100.5%
 
L77840.3%
 
F29250.1%
 
w439< 0.1%
 
k439< 0.1%
 
Other values (2)330< 0.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
22447590.6%
 
-234209.4%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2598020100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e36062413.9%
 
i28439310.9%
 
n2505179.6%
 
2244758.6%
 
t2161488.3%
 
r2144328.3%
 
v1876487.2%
 
o1671446.4%
 
N1006843.9%
 
u1004103.9%
 
s1002453.9%
 
a988393.8%
 
P720282.8%
 
l341291.3%
 
d267841.0%
 
m266461.0%
 
p235850.9%
 
-234200.9%
 
c194940.8%
 
S159370.6%
 
g149360.6%
 
y118750.5%
 
f117100.5%
 
L77840.3%
 
F29250.1%
 
Other values (4)1208< 0.1%
 

detailed_industry_recode
Real number (ℝ≥0)

ZEROS

Distinct: 52
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.35232028
Minimum: 0
Maximum: 51
Zeros: 100684
Zeros (%): 50.5%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 33
95-th percentile: 44
Maximum: 51
Range: 51
Interquartile range (IQR): 33

Descriptive statistics

Standard deviation: 18.0671288
Coefficient of variation (CV): 1.17683376
Kurtosis: -1.501107921
Mean: 15.35232028
Median Absolute Deviation (MAD): 0
Skewness: 0.5166876791
Sum: 3063141
Variance: 326.421143
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count    Frequency (%)
0                  100684   50.5%
33                 17070    8.6%
43                 8283     4.2%
4                  5984     3.0%
42                 4683     2.3%
45                 4482     2.2%
29                 4209     2.1%
37                 4022     2.0%
41                 3964     2.0%
32                 3596     1.8%
35                 3380     1.7%
39                 2937     1.5%
34                 2765     1.4%
44                 2549     1.3%
2                  2196     1.1%
11                 1764     0.9%
50                 1704     0.9%
40                 1651     0.8%
47                 1644     0.8%
38                 1629     0.8%
24                 1503     0.8%
12                 1350     0.7%
19                 1346     0.7%
30                 1181     0.6%
31                 1178     0.6%
Other values (27)  13769    6.9%

Value              Count    Frequency (%)
0                  100684   50.5%
1                  827      0.4%
2                  2196     1.1%
3                  563      0.3%
4                  5984     3.0%
5                  553      0.3%
6                  554      0.3%
7                  422      0.2%
8                  550      0.3%
9                  993      0.5%

Value              Count    Frequency (%)
51                 36       < 0.1%
50                 1704     0.9%
49                 610      0.3%
48                 652      0.3%
47                 1644     0.8%
46                 187      0.1%
45                 4482     2.2%
44                 2549     1.3%
43                 8283     4.2%
42                 4683     2.3%
 

detailed_occupation_recode
Real number (ℝ≥0)

ZEROS

Distinct: 47
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.30655614
Minimum: 0
Maximum: 46
Zeros: 100684
Zeros (%): 50.5%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 26
95-th percentile: 38
Maximum: 46
Range: 46
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 14.45420392
Coefficient of variation (CV): 1.278391381
Kurtosis: -0.8965333655
Mean: 11.30655614
Median Absolute Deviation (MAD): 0
Skewness: 0.829238138
Sum: 2255918
Variance: 208.9240109
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)
Value              Count    Frequency (%)
0                  100684   50.5%
2                  8756     4.4%
26                 7887     4.0%
19                 5413     2.7%
29                 5105     2.6%
36                 4145     2.1%
34                 4025     2.0%
10                 3683     1.8%
16                 3445     1.7%
23                 3392     1.7%
12                 3340     1.7%
33                 3325     1.7%
3                  3195     1.6%
35                 3168     1.6%
38                 3003     1.5%
31                 2699     1.4%
32                 2398     1.2%
37                 2234     1.1%
8                  2151     1.1%
42                 1918     1.0%
30                 1897     1.0%
24                 1847     0.9%
17                 1771     0.9%
28                 1661     0.8%
44                 1592     0.8%
Other values (22)  16789    8.4%

Value              Count    Frequency (%)
0                  100684   50.5%
1                  544      0.3%
2                  8756     4.4%
3                  3195     1.6%
4                  1364     0.7%
5                  855      0.4%
6                  441      0.2%
7                  731      0.4%
8                  2151     1.1%
9                  738      0.4%

Value              Count    Frequency (%)
46                 36       < 0.1%
45                 172      0.1%
44                 1592     0.8%
43                 1382     0.7%
42                 1918     1.0%
41                 1592     0.8%
40                 617      0.3%
39                 1017     0.5%
38                 3003     1.5%
37                 2234     1.1%
 

education
Categorical

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value                                    Count    Frequency (%)
High school graduate                     48407    24.3%
Children                                 47422    23.8%
Some college but no degree               27820    13.9%
Bachelors degree(BA AB BS)               19865    10.0%
7th and 8th grade                        8007     4.0%
10th grade                               7557     3.8%
11th grade                               6876     3.4%
Masters degree(MA MS MEng MEd MSW MBA)   6541     3.3%
9th grade                                6230     3.1%
Associates degree-occup /vocational      5358     2.7%
Associates degree-academic program       4363     2.2%
5th or 6th grade                         3277     1.6%
12th grade no diploma                    2126     1.1%
1st 2nd 3rd or 4th grade                 1799     0.9%
Prof school degree (MD DDS DVM LLB JD)   1793     0.9%
Doctorate degree(PhD EdD)                1263     0.6%
Less than 1st grade                      819      0.4%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 38
Median length: 20
Mean length: 18.86398561
Min length: 8

Overview of Unicode Properties

Unique unicode characters: 47
Unique unicode categories: 8
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e45956112.2%
 
41379911.0%
 
o2475306.6%
 
r2445866.5%
 
g2392326.4%
 
d2254216.0%
 
h2151325.7%
 
a2056525.5%
 
l1806114.8%
 
t1509664.0%
 
c1336693.6%
 
i1173973.1%
 
s1165663.1%
 
n998922.7%
 
B877942.3%
 
u815852.2%
 
S625601.7%
 
A625331.7%
 
M493731.3%
 
H484071.3%
 
C474221.3%
 
m386721.0%
 
(294620.8%
 
)294620.8%
 
b278200.7%
 
Other values (22)1486954.0%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter280329074.5%
 
Space Separator41379911.0%
 
Uppercase Letter40277610.7%
 
Decimal Number699311.9%
 
Open Punctuation294620.8%
 
Close Punctuation294620.8%
 
Dash Punctuation97210.3%
 
Other Punctuation53580.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
B8779421.8%
 
S6256015.5%
 
A6253315.5%
 
M4937312.3%
 
H4840712.0%
 
C4742211.8%
 
E143453.6%
 
D127543.2%
 
W65411.6%
 
L44051.1%
 
P30560.8%
 
V17930.4%
 
J17930.4%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e45956116.4%
 
o2475308.8%
 
r2445868.7%
 
g2392328.5%
 
d2254218.0%
 
h2151327.7%
 
a2056527.3%
 
l1806116.4%
 
t1509665.4%
 
c1336694.8%
 
i1173974.2%
 
s1165664.2%
 
n998923.6%
 
u815852.9%
 
m386721.4%
 
b278201.0%
 
p118470.4%
 
v53580.2%
 
f17930.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
413799100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
12605337.3%
 
7800711.4%
 
8800711.4%
 
0755710.8%
 
962308.9%
 
239255.6%
 
532774.7%
 
632774.7%
 
317992.6%
 
417992.6%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(29462100.0%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
)29462100.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-9721100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
/5358100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin320606685.2%
 
Common55773314.8%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e45956114.3%
 
o2475307.7%
 
r2445867.6%
 
g2392327.5%
 
d2254217.0%
 
h2151326.7%
 
a2056526.4%
 
l1806115.6%
 
t1509664.7%
 
c1336694.2%
 
i1173973.7%
 
s1165663.6%
 
n998923.1%
 
B877942.7%
 
u815852.5%
 
S625602.0%
 
A625332.0%
 
M493731.5%
 
H484071.5%
 
C474221.5%
 
m386721.2%
 
b278200.9%
 
E143450.4%
 
D127540.4%
 
p118470.4%
 
Other values (7)247390.8%
 

Most frequent Common characters

ValueCountFrequency (%) 
41379974.2%
 
(294625.3%
 
)294625.3%
 
1260534.7%
 
-97211.7%
 
780071.4%
 
880071.4%
 
075571.4%
 
962301.1%
 
/53581.0%
 
239250.7%
 
532770.6%
 
632770.6%
 
317990.3%
 
417990.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII3763799100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e45956112.2%
 
41379911.0%
 
o2475306.6%
 
r2445866.5%
 
g2392326.4%
 
d2254216.0%
 
h2151325.7%
 
a2056525.5%
 
l1806114.8%
 
t1509664.0%
 
c1336693.6%
 
i1173973.1%
 
s1165663.1%
 
n998922.7%
 
B877942.3%
 
u815852.2%
 
S625601.7%
 
A625331.7%
 
M493731.3%
 
H484071.3%
 
C474221.3%
 
m386721.0%
 
(294620.8%
 
)294620.8%
 
b278200.7%
 
Other values (22)1486954.0%
 

wage_per_hour
Real number (ℝ≥0)

ZEROS

Distinct: 1240
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 55.42690818
Minimum: 0
Maximum: 9999
Zeros: 188219
Zeros (%): 94.3%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 495
Maximum: 9999
Range: 9999
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 274.8964539
Coefficient of variation (CV): 4.959620931
Kurtosis: 155.2188969
Mean: 55.42690818
Median Absolute Deviation (MAD): 0
Skewness: 8.935096531
Sum: 11058943
Variance: 75568.06037
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count    Frequency (%)
0                    188219   94.3%
500                  734      0.4%
600                  546      0.3%
700                  534      0.3%
800                  507      0.3%
1000                 386      0.2%
425                  376      0.2%
900                  336      0.2%
550                  280      0.1%
1200                 256      0.1%
1100                 235      0.1%
650                  229      0.1%
450                  222      0.1%
1500                 221      0.1%
750                  202      0.1%
1300                 198      0.1%
850                  167      0.1%
525                  147      0.1%
1600                 136      0.1%
1400                 132      0.1%
1800                 127      0.1%
400                  125      0.1%
1700                 116      0.1%
2000                 108      0.1%
475                  105      0.1%
Other values (1215)  4879     2.4%

Value                Count    Frequency (%)
0                    188219   94.3%
20                   1        < 0.1%
70                   1        < 0.1%
75                   2        < 0.1%
100                  11       < 0.1%
110                  1        < 0.1%
125                  1        < 0.1%
135                  1        < 0.1%
143                  1        < 0.1%
150                  6        < 0.1%

Value                Count    Frequency (%)
9999                 1        < 0.1%
9916                 1        < 0.1%
9800                 2        < 0.1%
9400                 2        < 0.1%
9000                 1        < 0.1%
8800                 1        < 0.1%
8600                 1        < 0.1%
8500                 1        < 0.1%
8300                 1        < 0.1%
8000                 4        < 0.1%
 
Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value                   Count    Frequency (%)
Not in universe         186943   93.7%
High school             6892     3.5%
College or university   5688     2.9%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 21
Median length: 15
Mean length: 15.03287842
Min length: 11

Overview of Unicode Properties

Unique unicode characters: 18
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
39215413.1%
 
i39215413.1%
 
e39095013.0%
 
n37957412.7%
 
o2121037.1%
 
s1995236.7%
 
r1983196.6%
 
t1926316.4%
 
u1926316.4%
 
v1926316.4%
 
N1869436.2%
 
l182680.6%
 
h137840.5%
 
g125800.4%
 
H68920.2%
 
c68920.2%
 
C56880.2%
 
y56880.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter240772880.3%
 
Space Separator39215413.1%
 
Uppercase Letter1995236.7%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
N18694393.7%
 
H68923.5%
 
C56882.9%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
i39215416.3%
 
e39095016.2%
 
n37957415.8%
 
o2121038.8%
 
s1995238.3%
 
r1983198.2%
 
t1926318.0%
 
u1926318.0%
 
v1926318.0%
 
l182680.8%
 
h137840.6%
 
g125800.5%
 
c68920.3%
 
y56880.2%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
392154100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin260725186.9%
 
Common39215413.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
i39215415.0%
 
e39095015.0%
 
n37957414.6%
 
o2121038.1%
 
s1995237.7%
 
r1983197.6%
 
t1926317.4%
 
u1926317.4%
 
v1926317.4%
 
N1869437.2%
 
l182680.7%
 
h137840.5%
 
g125800.5%
 
H68920.3%
 
c68920.3%
 
C56880.2%
 
y56880.2%
 

Most frequent Common characters

ValueCountFrequency (%) 
392154100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2999405100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
39215413.1%
 
i39215413.1%
 
e39095013.0%
 
n37957412.7%
 
o2121037.1%
 
s1995236.7%
 
r1983196.6%
 
t1926316.4%
 
u1926316.4%
 
v1926316.4%
 
N1869436.2%
 
l182680.6%
 
h137840.5%
 
g125800.4%
 
H68920.2%
 
c68920.2%
 
C56880.2%
 
y56880.2%
 

marital_stat
Categorical

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value                             Count    Frequency (%)
Never married                     86485    43.3%
Married-civilian spouse present   84222    42.2%
Divorced                          12710    6.4%
Widowed                           10463    5.2%
Separated                         3460     1.7%
Married-spouse absent             1518     0.8%
Married-A F spouse present        665      0.3%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 31
Median length: 13
Mean length: 19.99977947
Min length: 7

Overview of Unicode Properties

Unique unicode characters: 26
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e63365015.9%
 
r53332213.4%
 
i44872911.2%
 
a2655506.7%
 
s2592156.5%
 
2584426.5%
 
d2099865.3%
 
v1834174.6%
 
p1747524.4%
 
n1706274.3%
 
o1095782.7%
 
c969322.4%
 
t898652.3%
 
N864852.2%
 
m864852.2%
 
M864052.2%
 
-864052.2%
 
u864052.2%
 
l842222.1%
 
D127100.3%
 
W104630.3%
 
w104630.3%
 
S34600.1%
 
b1518< 0.1%
 
A665< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter344471686.3%
 
Space Separator2584426.5%
 
Uppercase Letter2008535.0%
 
Dash Punctuation864052.2%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
N8648543.1%
 
M8640543.0%
 
D127106.3%
 
W104635.2%
 
S34601.7%
 
A6650.3%
 
F6650.3%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e63365018.4%
 
r53332215.5%
 
i44872913.0%
 
a2655507.7%
 
s2592157.5%
 
d2099866.1%
 
v1834175.3%
 
p1747525.1%
 
n1706275.0%
 
o1095783.2%
 
c969322.8%
 
t898652.6%
 
m864852.5%
 
u864052.5%
 
l842222.4%
 
w104630.3%
 
b1518< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
258442100.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-86405100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin364556991.4%
 
Common3448478.6%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e63365017.4%
 
r53332214.6%
 
i44872912.3%
 
a2655507.3%
 
s2592157.1%
 
d2099865.8%
 
v1834175.0%
 
p1747524.8%
 
n1706274.7%
 
o1095783.0%
 
c969322.7%
 
t898652.5%
 
N864852.4%
 
m864852.4%
 
M864052.4%
 
u864052.4%
 
l842222.3%
 
D127100.3%
 
W104630.3%
 
w104630.3%
 
S34600.1%
 
b1518< 0.1%
 
A665< 0.1%
 
F665< 0.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
25844274.9%
 
-8640525.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII3990416100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e63365015.9%
 
r53332213.4%
 
i44872911.2%
 
a2655506.7%
 
s2592156.5%
 
2584426.5%
 
d2099865.3%
 
v1834174.6%
 
p1747524.4%
 
n1706274.3%
 
o1095782.7%
 
c969322.4%
 
t898652.3%
 
N864852.2%
 
m864852.2%
 
M864052.2%
 
-864052.2%
 
u864052.2%
 
l842222.1%
 
D127100.3%
 
W104630.3%
 
w104630.3%
 
S34600.1%
 
b1518< 0.1%
 
A665< 0.1%
 
Distinct: 24
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value                                 Count    Frequency (%)
Not in universe or children           100684   50.5%
Retail trade                          17070    8.6%
Manufacturing-durable goods           9015     4.5%
Education                             8283     4.2%
Manufacturing-nondurable goods        6897     3.5%
Finance insurance and real estate     6145     3.1%
Construction                          5984     3.0%
Business and repair services          5651     2.8%
Medical except hospital               4683     2.3%
Public administration                 4610     2.3%
Other professional services           4482     2.2%
Transportation                        4209     2.1%
Hospital services                     3964     2.0%
Wholesale trade                       3596     1.8%
Agriculture                           3023     1.5%
Personal services except private HH   2937     1.5%
Social services                       2549     1.3%
Entertainment                         1651     0.8%
Communications                        1181     0.6%
Utilities and sanitary services       1178     0.6%
Private household services            945      0.5%
Mining                                563      0.3%
Forestry and fisheries                187      0.1%
Armed Forces                          36       < 0.1%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 35
Median length: 27
Mean length: 23.39614982
Min length: 6

Overview of Unicode Properties

Unique unicode characters: 38
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
52788211.3%
 
e49311810.6%
 
i4547399.7%
 
n4459899.6%
 
r4441439.5%
 
o3045366.5%
 
t2420205.2%
 
s2332775.0%
 
a1907494.1%
 
c1885614.0%
 
u1872654.0%
 
d1848924.0%
 
l1800573.9%
 
v1262722.7%
 
h1155222.5%
 
N1006842.2%
 
g354100.8%
 
p335460.7%
 
M211580.5%
 
f205810.4%
 
b205220.4%
 
R170700.4%
 
-159120.3%
 
E99340.2%
 
H98380.2%
 
Other values (13)643931.4%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter391884383.9%
 
Space Separator52788211.3%
 
Uppercase Letter2054334.4%
 
Dash Punctuation159120.3%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
N10068449.0%
 
M2115810.3%
 
R170708.3%
 
E99344.8%
 
H98384.8%
 
P84924.1%
 
C71653.5%
 
F63683.1%
 
B56512.8%
 
O44822.2%
 
T42092.0%
 
W35961.8%
 
A30591.5%
 
S25491.2%
 
U11780.6%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e49311812.6%
 
i45473911.6%
 
n44598911.4%
 
r44414311.3%
 
o3045367.8%
 
t2420206.2%
 
s2332776.0%
 
a1907494.9%
 
c1885614.8%
 
u1872654.8%
 
d1848924.7%
 
l1800574.6%
 
v1262723.2%
 
h1155222.9%
 
g354100.9%
 
p335460.9%
 
f205810.5%
 
b205220.5%
 
m86590.2%
 
x76200.2%
 
y1365< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
527882100.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-15912100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin412427688.4%
 
Common54379411.6%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e49311812.0%
 
i45473911.0%
 
n44598910.8%
 
r44414310.8%
 
o3045367.4%
 
t2420205.9%
 
s2332775.7%
 
a1907494.6%
 
c1885614.6%
 
u1872654.5%
 
d1848924.5%
 
l1800574.4%
 
v1262723.1%
 
h1155222.8%
 
N1006842.4%
 
g354100.9%
 
p335460.8%
 
M211580.5%
 
f205810.5%
 
b205220.5%
 
R170700.4%
 
E99340.2%
 
H98380.2%
 
m86590.2%
 
P84920.2%
 
Other values (11)472421.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
52788297.1%
 
-159122.9%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII4668070100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
52788211.3%
 
e49311810.6%
 
i4547399.7%
 
n4459899.6%
 
r4441439.5%
 
o3045366.5%
 
t2420205.2%
 
s2332775.0%
 
a1907494.1%
 
c1885614.0%
 
u1872654.0%
 
d1848924.0%
 
l1800573.9%
 
v1262722.7%
 
h1155222.5%
 
N1006842.2%
 
g354100.8%
 
p335460.7%
 
M211580.5%
 
f205810.4%
 
b205220.4%
 
R170700.4%
 
-159120.3%
 
E99340.2%
 
H98380.2%
 
Other values (13)643931.4%
 
Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value                                   Count    Frequency (%)
Not in universe                         100684   50.5%
Adm support including clerical          14837    7.4%
Professional specialty                  13940    7.0%
Executive admin and managerial          12495    6.3%
Other service                           12099    6.1%
Sales                                   11783    5.9%
Precision production craft & repair     10518    5.3%
Machine operators assmblrs & inspctrs   6379     3.2%
Handlers equip cleaners etc             4127     2.1%
Transportation and material moving      4020     2.0%
Farming forestry and fishing            3146     1.6%
Technicians and related support         3018     1.5%
Protective services                     1661     0.8%
Private household services              780      0.4%
Armed Forces                            36       < 0.1%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 37
Median length: 15
Mean length: 19.74349323
Min length: 5

Overview of Unicode Properties

Unique unicode characters: 34
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
42318110.7%
 
i41471610.5%
 
e41013510.4%
 
n3590879.1%
 
r2998397.6%
 
s2603156.6%
 
t2173205.5%
 
o2091945.3%
 
a2016285.1%
 
u1612964.1%
 
c1457853.7%
 
v1341803.4%
 
l1191203.0%
 
N1006842.6%
 
p915912.3%
 
d833272.1%
 
m574281.5%
 
g376441.0%
 
f307500.8%
 
P268990.7%
 
h262020.7%
 
y170860.4%
 
&168970.4%
 
A148730.4%
 
E124950.3%
 
Other values (9)676091.7%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter329964483.8%
 
Space Separator42318110.7%
 
Uppercase Letter1995595.1%
 
Other Punctuation168970.4%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
N10068450.5%
 
P2689913.5%
 
A148737.5%
 
E124956.3%
 
O120996.1%
 
S117835.9%
 
T70383.5%
 
M63793.2%
 
H41272.1%
 
F31821.6%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
i41471612.6%
 
e41013512.4%
 
n35908710.9%
 
r2998399.1%
 
s2603157.9%
 
t2173206.6%
 
o2091946.3%
 
a2016286.1%
 
u1612964.9%
 
c1457854.4%
 
v1341804.1%
 
l1191203.6%
 
p915912.8%
 
d833272.5%
 
m574281.7%
 
g376441.1%
 
f307500.9%
 
h262020.8%
 
y170860.5%
 
x124950.4%
 
b63790.2%
 
q41270.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
423181100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
&16897100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin349920388.8%
 
Common44007811.2%
 

Most frequent Latin characters

ValueCountFrequency (%) 
i41471611.9%
 
e41013511.7%
 
n35908710.3%
 
r2998398.6%
 
s2603157.4%
 
t2173206.2%
 
o2091946.0%
 
a2016285.8%
 
u1612964.6%
 
c1457854.2%
 
v1341803.8%
 
l1191203.4%
 
N1006842.9%
 
p915912.6%
 
d833272.4%
 
m574281.6%
 
g376441.1%
 
f307500.9%
 
P268990.8%
 
h262020.7%
 
y170860.5%
 
A148730.4%
 
E124950.4%
 
x124950.4%
 
O120990.3%
 
Other values (7)430151.2%
 

Most frequent Common characters

ValueCountFrequency (%) 
42318196.2%
 
&168973.8%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII3939281100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
42318110.7%
 
i41471610.5%
 
e41013510.4%
 
n3590879.1%
 
r2998397.6%
 
s2603156.6%
 
t2173205.5%
 
o2091945.3%
 
a2016285.1%
 
u1612964.1%
 
c1457853.7%
 
v1341803.4%
 
l1191203.0%
 
N1006842.6%
 
p915912.3%
 
d833272.1%
 
m574281.5%
 
g376441.0%
 
f307500.8%
 
P268990.7%
 
h262020.7%
 
y170860.4%
 
&168970.4%
 
A148730.4%
 
E124950.3%
 
Other values (9)676091.7%
 

race
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value                         Count    Frequency (%)
White                         167365   83.9%
Black                         20415    10.2%
Asian or Pacific Islander     5835     2.9%
Other                         3657     1.8%
Amer Indian Aleut or Eskimo   2251     1.1%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 27
Median length: 5
Mean length: 5.833096936
Min length: 5

Overview of Unicode Properties

Unique unicode characters: 24
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
i18937216.3%
 
e18135915.6%
 
t17327314.9%
 
h17102214.7%
 
W16736514.4%
 
a401713.5%
 
c320852.8%
 
l285012.4%
 
265092.3%
 
k226661.9%
 
B204151.8%
 
r198291.7%
 
n161721.4%
 
s139211.2%
 
A103370.9%
 
o103370.9%
 
I80860.7%
 
d80860.7%
 
P58350.5%
 
f58350.5%
 
m45020.4%
 
O36570.3%
 
u22510.2%
 
E22510.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter91938279.0%
 
Uppercase Letter21794618.7%
 
Space Separator265092.3%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
W16736576.8%
 
B204159.4%
 
A103374.7%
 
I80863.7%
 
P58352.7%
 
O36571.7%
 
E22511.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
i18937220.6%
 
e18135919.7%
 
t17327318.8%
 
h17102218.6%
 
a401714.4%
 
c320853.5%
 
l285013.1%
 
k226662.5%
 
r198292.2%
 
n161721.8%
 
s139211.5%
 
o103371.1%
 
d80860.9%
 
f58350.6%
 
m45020.5%
 
u22510.2%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
26509100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin113732897.7%
 
Common265092.3%
 

Most frequent Latin characters

ValueCountFrequency (%) 
i18937216.7%
 
e18135915.9%
 
t17327315.2%
 
h17102215.0%
 
W16736514.7%
 
a401713.5%
 
c320852.8%
 
l285012.5%
 
k226662.0%
 
B204151.8%
 
r198291.7%
 
n161721.4%
 
s139211.2%
 
A103370.9%
 
o103370.9%
 
I80860.7%
 
d80860.7%
 
P58350.5%
 
f58350.5%
 
m45020.4%
 
O36570.3%
 
u22510.2%
 
E22510.2%
 

Most frequent Common characters

ValueCountFrequency (%) 
26509100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII1163837100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
i18937216.3%
 
e18135915.6%
 
t17327314.9%
 
h17102214.7%
 
W16736514.4%
 
a401713.5%
 
c320852.8%
 
l285012.4%
 
265092.3%
 
k226661.9%
 
B204151.8%
 
r198291.7%
 
n161721.4%
 
s139211.2%
 
A103370.9%
 
o103370.9%
 
I80860.7%
 
d80860.7%
 
P58350.5%
 
f58350.5%
 
m45020.4%
 
O36570.3%
 
u22510.2%
 
E22510.2%
 

hispanic_origin
Categorical

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value                       Count    Frequency (%)
All other                   171907   86.2%
Mexican-American            8079     4.0%
Mexican (Mexicano)          7234     3.6%
Central or South American   3895     2.0%
Puerto Rican                3313     1.7%
Other Spanish               2485     1.2%
Cuban                       1126     0.6%
NA                          874      0.4%
Do not know                 306      0.2%
Chicano                     304      0.2%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 25
Median length: 9
Mean length: 9.968509896
Min length: 2

Overview of Unicode Properties

Unique unicode characters: 31
Unique unicode categories: 6
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
l34770917.5%
 
e21612110.9%
 
r1974699.9%
 
1972369.9%
 
o1914669.6%
 
t1858019.3%
 
A1847559.3%
 
h1810769.1%
 
n462562.3%
 
a456442.3%
 
i406232.0%
 
c381381.9%
 
M225471.1%
 
x225471.1%
 
m119740.6%
 
u83340.4%
 
-80790.4%
 
(72340.4%
 
)72340.4%
 
S63800.3%
 
C53250.3%
 
P33130.2%
 
R33130.2%
 
O24850.1%
 
p24850.1%
 
Other values (6)54030.3%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter153986677.4%
 
Uppercase Letter22929811.5%
 
Space Separator1972369.9%
 
Dash Punctuation80790.4%
 
Open Punctuation72340.4%
 
Close Punctuation72340.4%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
A18475580.6%
 
M225479.8%
 
S63802.8%
 
C53252.3%
 
P33131.4%
 
R33131.4%
 
O24851.1%
 
N8740.4%
 
D3060.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
l34770922.6%
 
e21612114.0%
 
r19746912.8%
 
o19146612.4%
 
t18580112.1%
 
h18107611.8%
 
n462563.0%
 
a456443.0%
 
i406232.6%
 
c381382.5%
 
x225471.5%
 
m119740.8%
 
u83340.5%
 
p24850.2%
 
s24850.2%
 
b11260.1%
 
k306< 0.1%
 
w306< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
197236100.0%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(7234100.0%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
)7234100.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-8079100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin176916488.9%
 
Common21978311.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
l34770919.7%
 
e21612112.2%
 
r19746911.2%
 
o19146610.8%
 
t18580110.5%
 
A18475510.4%
 
h18107610.2%
 
n462562.6%
 
a456442.6%
 
i406232.3%
 
c381382.2%
 
M225471.3%
 
x225471.3%
 
m119740.7%
 
u83340.5%
 
S63800.4%
 
C53250.3%
 
P33130.2%
 
R33130.2%
 
O24850.1%
 
p24850.1%
 
s24850.1%
 
b11260.1%
 
N874< 0.1%
 
D306< 0.1%
 
Other values (2)612< 0.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
19723689.7%
 
-80793.7%
 
(72343.3%
 
)72343.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII1988947100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
l34770917.5%
 
e21612110.9%
 
r1974699.9%
 
1972369.9%
 
o1914669.6%
 
t1858019.3%
 
A1847559.3%
 
h1810769.1%
 
n462562.3%
 
a456442.3%
 
i406232.0%
 
c381381.9%
 
M225471.1%
 
x225471.1%
 
m119740.6%
 
u83340.4%
 
-80790.4%
 
(72340.4%
 
)72340.4%
 
S63800.3%
 
C53250.3%
 
P33130.2%
 
R33130.2%
 
O24850.1%
 
p24850.1%
 
Other values (6)54030.3%
 

sex
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value   Count   Frequency (%)
Female  103984  52.1%
Male    95539   47.9%

Unique

Unique: 0
Unique (%): 0.0%

Length

Max length: 6
Median length: 6
Mean length: 5.042325947
Min length: 4

Overview of Unicode Properties

Unique unicode characters: 6
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
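
The category totals reported below for sex can be reproduced from the value counts alone. The following is a minimal sketch, not the report generator's own code: it uses Python's standard `unicodedata` module, and the `CATEGORY_NAMES` mapping and `category_counts` helper are names introduced here for illustration.

```python
import unicodedata
from collections import Counter

# Value counts copied from the sex table in this report.
value_counts = {"Female": 103984, "Male": 95539}

# Readable names for the two general categories that occur in this column
# (hypothetical mapping; extend it for columns with other categories).
CATEGORY_NAMES = {"Ll": "Lowercase Letter", "Lu": "Uppercase Letter"}


def category_counts(counts):
    """Count characters per Unicode general category, weighted by row count."""
    totals = Counter()
    for value, n in counts.items():
        for ch in value:
            code = unicodedata.category(ch)  # e.g. "Ll" or "Lu"
            totals[CATEGORY_NAMES.get(code, code)] += n
    return totals


totals = category_counts(value_counts)
grand_total = sum(totals.values())
for name, n in totals.most_common():
    print(f"{name}: {n} ({100 * n / grand_total:.1f}%)")
```

Weighting each distinct value by its row count avoids iterating over all 199523 rows, which is enough here because the column has only two distinct values.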

Most occurring characters

ValueCountFrequency (%) 
e30350730.2%
 
a19952319.8%
 
l19952319.8%
 
F10398410.3%
 
m10398410.3%
 
M955399.5%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter80653780.2%
 
Uppercase Letter19952319.8%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
F10398452.1%
 
M9553947.9%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e30350737.6%
 
a19952324.7%
 
l19952324.7%
 
m10398412.9%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin1006060100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e30350730.2%
 
a19952319.8%
 
l19952319.8%
 
F10398410.3%
 
m10398410.3%
 
M955399.5%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII1006060100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e30350730.2%
 
a19952319.8%
 
l19952319.8%
 
F10398410.3%
 
m10398410.3%
 
M955399.5%
 
Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value            Count   Frequency (%)
Not in universe  180459  90.4%
No               16034   8.0%
Yes              3030    1.5%

Unique

Unique: 0
Unique (%): 0.0%

Length

Max length: 15
Median length: 15
Mean length: 13.77306376
Min length: 2

Overview of Unicode Properties

Unique unicode characters: 12
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e36394813.2%
 
36091813.1%
 
i36091813.1%
 
n36091813.1%
 
N1964937.2%
 
o1964937.2%
 
s1834896.7%
 
t1804596.6%
 
u1804596.6%
 
v1804596.6%
 
r1804596.6%
 
Y30300.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter218760279.6%
 
Space Separator36091813.1%
 
Uppercase Letter1995237.3%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
N19649398.5%
 
Y30301.5%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e36394816.6%
 
i36091816.5%
 
n36091816.5%
 
o1964939.0%
 
s1834898.4%
 
t1804598.2%
 
u1804598.2%
 
v1804598.2%
 
r1804598.2%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
360918100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin238712586.9%
 
Common36091813.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e36394815.2%
 
i36091815.1%
 
n36091815.1%
 
N1964938.2%
 
o1964938.2%
 
s1834897.7%
 
t1804597.6%
 
u1804597.6%
 
v1804597.6%
 
r1804597.6%
 
Y30300.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
360918100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2748043100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e36394813.2%
 
36091813.1%
 
i36091813.1%
 
n36091813.1%
 
N1964937.2%
 
o1964937.2%
 
s1834896.7%
 
t1804596.6%
 
u1804596.6%
 
v1804596.6%
 
r1804596.6%
 
Y30300.1%
 
Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value                  Count   Frequency (%)
Not in universe        193453  97.0%
Other job loser        2038    1.0%
Re-entrant             2019    1.0%
Job loser - on layoff  976     0.5%
Job leaver             598     0.3%
New entrant            439     0.2%

Unique

Unique: 0
Unique (%): 0.0%

Length

Max length: 21
Median length: 15
Mean length: 14.9549676
Min length: 10

Overview of Unicode Properties

Unique unicode characters: 23
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e39807013.3%
 
39592313.3%
 
n39279813.2%
 
i38690613.0%
 
o2020316.8%
 
r2015616.8%
 
t2004076.7%
 
s1964676.6%
 
v1940516.5%
 
N1938926.5%
 
u1934536.5%
 
l45880.2%
 
a40320.1%
 
b36120.1%
 
-29950.1%
 
O20380.1%
 
h20380.1%
 
j20380.1%
 
R20190.1%
 
f19520.1%
 
J15740.1%
 
y976< 0.1%
 
w439< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter238541979.9%
 
Space Separator39592313.3%
 
Uppercase Letter1995236.7%
 
Dash Punctuation29950.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
N19389297.2%
 
O20381.0%
 
R20191.0%
 
J15740.8%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e39807016.7%
 
n39279816.5%
 
i38690616.2%
 
o2020318.5%
 
r2015618.4%
 
t2004078.4%
 
s1964678.2%
 
v1940518.1%
 
u1934538.1%
 
l45880.2%
 
a40320.2%
 
b36120.2%
 
h20380.1%
 
j20380.1%
 
f19520.1%
 
y976< 0.1%
 
w439< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
395923100.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-2995100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin258494286.6%
 
Common39891813.4%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e39807015.4%
 
n39279815.2%
 
i38690615.0%
 
o2020317.8%
 
r2015617.8%
 
t2004077.8%
 
s1964677.6%
 
v1940517.5%
 
N1938927.5%
 
u1934537.5%
 
l45880.2%
 
a40320.2%
 
b36120.1%
 
O20380.1%
 
h20380.1%
 
j20380.1%
 
R20190.1%
 
f19520.1%
 
J15740.1%
 
y976< 0.1%
 
w439< 0.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
39592399.2%
 
-29950.8%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2983860100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e39807013.3%
 
39592313.3%
 
n39279813.2%
 
i38690613.0%
 
o2020316.8%
 
r2015616.8%
 
t2004076.7%
 
s1964676.6%
 
v1940516.5%
 
N1938926.5%
 
u1934536.5%
 
l45880.2%
 
a40320.1%
 
b36120.1%
 
-29950.1%
 
O20380.1%
 
h20380.1%
 
j20380.1%
 
R20190.1%
 
f19520.1%
 
J15740.1%
 
y976< 0.1%
 
w439< 0.1%
 
Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value                               Count   Frequency (%)
Children or Armed Forces            123769  62.0%
Full-time schedules                 40736   20.4%
Not in labor force                  26808   13.4%
PT for non-econ reasons usually FT  3322    1.7%
Unemployed full-time                2311    1.2%
PT for econ reasons usually PT      1209    0.6%
Unemployed part- time               843     0.4%
PT for econ reasons usually FT      525     0.3%

Unique

Unique: 0
Unique (%): 0.0%

Length

Max length: 34
Median length: 24
Mean length: 22.33263834
Min length: 18

Overview of Unicode Properties

Unique unicode characters: 27
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
r55964712.6%
 
e53989712.1%
 
52174411.7%
 
o3496067.8%
 
d2914286.5%
 
l2906736.5%
 
s2204094.9%
 
c1963694.4%
 
i1944674.4%
 
m1708133.8%
 
n1704873.8%
 
F1683523.8%
 
h1645053.7%
 
C1237692.8%
 
A1237692.8%
 
u938952.1%
 
t715411.6%
 
-472121.1%
 
a377630.8%
 
f341750.8%
 
N268080.6%
 
b268080.6%
 
T101120.2%
 
y82100.2%
 
P62650.1%
 
Other values (2)71510.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter342469076.9%
 
Space Separator52174411.7%
 
Uppercase Letter46222910.4%
 
Dash Punctuation472121.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
F16835236.4%
 
C12376926.8%
 
A12376926.8%
 
N268085.8%
 
T101122.2%
 
P62651.4%
 
U31540.7%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
r55964716.3%
 
e53989715.8%
 
o34960610.2%
 
d2914288.5%
 
l2906738.5%
 
s2204096.4%
 
c1963695.7%
 
i1944675.7%
 
m1708135.0%
 
n1704875.0%
 
h1645054.8%
 
u938952.7%
 
t715412.1%
 
a377631.1%
 
f341751.0%
 
b268080.8%
 
y82100.2%
 
p39970.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
521744100.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-47212100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin388691987.2%
 
Common56895612.8%
 

Most frequent Latin characters

ValueCountFrequency (%) 
r55964714.4%
 
e53989713.9%
 
o3496069.0%
 
d2914287.5%
 
l2906737.5%
 
s2204095.7%
 
c1963695.1%
 
i1944675.0%
 
m1708134.4%
 
n1704874.4%
 
F1683524.3%
 
h1645054.2%
 
C1237693.2%
 
A1237693.2%
 
u938952.4%
 
t715411.8%
 
a377631.0%
 
f341750.9%
 
N268080.7%
 
b268080.7%
 
T101120.3%
 
y82100.2%
 
P62650.2%
 
p39970.1%
 
U31540.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
52174491.7%
 
-472128.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII4455875100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
r55964712.6%
 
e53989712.1%
 
52174411.7%
 
o3496067.8%
 
d2914286.5%
 
l2906736.5%
 
s2204094.9%
 
c1963694.4%
 
i1944674.4%
 
m1708133.8%
 
n1704873.8%
 
F1683523.8%
 
h1645053.7%
 
C1237692.8%
 
A1237692.8%
 
u938952.1%
 
t715411.6%
 
-472121.1%
 
a377630.8%
 
f341750.8%
 
N268080.6%
 
b268080.6%
 
T101120.2%
 
y82100.2%
 
P62650.1%
 
Other values (2)71510.2%
 

capital_gains
Real number (ℝ≥0)

ZEROS

Distinct: 132
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 434.7189898
Minimum: 0
Maximum: 99999
Zeros: 192144
Zeros (%): 96.3%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 99999
Range: 99999
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 4697.53128
Coefficient of variation (CV): 10.8059031
Kurtosis: 393.0628325
Mean: 434.7189898
Median Absolute Deviation (MAD): 0
Skewness: 18.99082234
Sum: 86736437
Variance: 22066800.12
Monotonicity: Not monotonic
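
Several of the capital_gains figures above are linked by simple identities, which gives a quick consistency check on the report. The sketch below only uses numbers copied from this report; the variable names are chosen here for illustration, and the small tolerances allow for the rounding of the printed values.

```python
# Figures copied from the capital_gains section and the dataset overview.
n = 199523            # number of observations
mean = 434.7189898
std = 4697.53128
zeros = 192144

# Coefficient of variation is the standard deviation divided by the mean.
cv = std / mean

# Variance is the square of the standard deviation.
variance = std ** 2

# The sum of the column is the mean times the number of observations.
implied_sum = mean * n

# Share of rows that are exactly zero, in percent.
zeros_pct = 100 * zeros / n

print(f"CV        = {cv:.7f}")
print(f"Variance  = {variance:.2f}")
print(f"Sum       = {implied_sum:.0f}")
print(f"Zeros (%) = {zeros_pct:.1f}")
```

The same four checks apply unchanged to capital_losses and dividends_from_stocks once their figures are substituted.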
Histogram with fixed size bins (bins=50)
Common values

Value               Count   Frequency (%)
0                   192144  96.3%
15024               788     0.4%
7688                609     0.3%
7298                582     0.3%
99999               390     0.2%
3103                237     0.1%
5178                207     0.1%
5013                158     0.1%
4386                151     0.1%
3325                121     0.1%
8614                118     0.1%
10520               98      < 0.1%
27828               94      < 0.1%
4650                93      < 0.1%
20051               91      < 0.1%
594                 88      < 0.1%
4064                83      < 0.1%
2174                83      < 0.1%
1086                81      < 0.1%
14084               77      < 0.1%
1409                77      < 0.1%
13550               74      < 0.1%
2829                71      < 0.1%
10605               70      < 0.1%
9386                70      < 0.1%
Other values (107)  2868    1.4%

Minimum 10 values

Value  Count   Frequency (%)
0      192144  96.3%
114    11      < 0.1%
401    33      < 0.1%
594    88      < 0.1%
914    17      < 0.1%
991    59      < 0.1%
1055   69      < 0.1%
1086   81      < 0.1%
1090   2       < 0.1%
1111   4       < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
99999  390    0.2%
41310  2      < 0.1%
34095  11     < 0.1%
27828  94     < 0.1%
25236  23     < 0.1%
25124  18     < 0.1%
22040  2      < 0.1%
20051  91     < 0.1%
18481  14     < 0.1%
15831  16     < 0.1%

capital_losses
Real number (ℝ≥0)

ZEROS

Distinct: 113
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.31378839
Minimum: 0
Maximum: 4608
Zeros: 195617
Zeros (%): 98.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 4608
Range: 4608
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 271.8964284
Coefficient of variation (CV): 7.286754847
Kurtosis: 61.63293305
Mean: 37.31378839
Median Absolute Deviation (MAD): 0
Skewness: 7.6325647
Sum: 7444959
Variance: 73927.66776
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value              Count   Frequency (%)
0                  195617  98.0%
1902               407     0.2%
1977               381     0.2%
1887               364     0.2%
1602               193     0.1%
2415               122     0.1%
1485               95      < 0.1%
1848               88      < 0.1%
1876               87      < 0.1%
1672               85      < 0.1%
1590               84      < 0.1%
1740               72      < 0.1%
2339               61      < 0.1%
1564               60      < 0.1%
1980               59      < 0.1%
1741               58      < 0.1%
1408               56      < 0.1%
1719               56      < 0.1%
2001               56      < 0.1%
2258               56      < 0.1%
1669               55      < 0.1%
1974               51      < 0.1%
2002               51      < 0.1%
2377               48      < 0.1%
2205               46      < 0.1%
Other values (88)  1215    0.6%

Minimum 10 values

Value  Count   Frequency (%)
0      195617  98.0%
155    1       < 0.1%
213    10      < 0.1%
323    10      < 0.1%
419    29      < 0.1%
625    25      < 0.1%
653    7       < 0.1%
772    5       < 0.1%
810    5       < 0.1%
880    9       < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
4608   4      < 0.1%
4356   30     < 0.1%
3900   2      < 0.1%
3770   5      < 0.1%
3683   4      < 0.1%
3500   10     < 0.1%
3175   8      < 0.1%
3004   11     < 0.1%
2824   27     < 0.1%
2788   7      < 0.1%

dividends_from_stocks
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 1478
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 197.5295329
Minimum: 0
Maximum: 99999
Zeros: 178382
Zeros (%): 89.4%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 400
Maximum: 99999
Range: 99999
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1984.163658
Coefficient of variation (CV): 10.04489622
Kurtosis: 1090.563754
Mean: 197.5295329
Median Absolute Deviation (MAD): 0
Skewness: 27.78650179
Sum: 39411685
Variance: 3936905.423
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                Count   Frequency (%)
0                    178382  89.4%
100                  1148    0.6%
500                  1030    0.5%
1000                 894     0.4%
200                  866     0.4%
50                   832     0.4%
2000                 574     0.3%
250                  555     0.3%
150                  549     0.3%
300                  523     0.3%
1                    472     0.2%
400                  409     0.2%
1500                 380     0.2%
2500                 372     0.2%
25                   360     0.2%
5000                 304     0.2%
3000                 292     0.1%
600                  287     0.1%
10                   253     0.1%
4000                 222     0.1%
20                   213     0.1%
2                    193     0.1%
10000                182     0.1%
5                    179     0.1%
125                  175     0.1%
Other values (1453)  9877    5.0%

Minimum 10 values

Value  Count   Frequency (%)
0      178382  89.4%
1      472     0.2%
2      193     0.1%
3      129     0.1%
4      75      < 0.1%
5      179     0.1%
6      100     0.1%
7      93      < 0.1%
8      94      < 0.1%
9      56      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
99999  25     < 0.1%
95095  1      < 0.1%
75000  5      < 0.1%
70000  3      < 0.1%
66621  2      < 0.1%
60000  7      < 0.1%
57678  1      < 0.1%
55000  1      < 0.1%
54600  2      < 0.1%
54500  2      < 0.1%

tax_filer_stat
Categorical

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value                         Count  Frequency (%)
Nonfiler                      75094  37.6%
Joint both under 65           67383  33.8%
Single                        37421  18.8%
Joint both 65+                8332   4.2%
Head of household             7426   3.7%
Joint one under 65 & one 65+  3867   1.9%

Unique

Unique: 0
Unique (%): 0.0%

Length

Max length: 28
Median length: 8
Mean length: 12.31297144
Min length: 6
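
The length statistics for tax_filer_stat follow directly from the six category labels and their counts. The sketch below recomputes them under that assumption; `weighted_median` is a helper written for this example, not a function from the profiling library.

```python
# Value counts copied from the tax_filer_stat table in this report.
value_counts = {
    "Nonfiler": 75094,
    "Joint both under 65": 67383,
    "Single": 37421,
    "Joint both 65+": 8332,
    "Head of household": 7426,
    "Joint one under 65 & one 65+": 3867,
}

# (label length, number of rows with that label)
length_counts = [(len(value), n) for value, n in value_counts.items()]
total_rows = sum(n for _, n in length_counts)
total_chars = sum(length * n for length, n in length_counts)


def weighted_median(pairs):
    """Return the middle length when every row is laid out in sorted order.

    For an odd number of rows this is the exact median; for an even number
    it returns the upper of the two middle values.
    """
    middle = sum(n for _, n in pairs) // 2
    seen = 0
    for length, n in sorted(pairs):
        seen += n
        if seen > middle:
            return length


max_len = max(length for length, _ in length_counts)
min_len = min(length for length, _ in length_counts)
median_len = weighted_median(length_counts)
mean_len = total_chars / total_rows

print(max_len, median_len, round(mean_len, 8), min_len)
```

The intermediate `total_chars` also equals the ASCII character total listed for this variable (2456721), since every character in the column is ASCII.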

Overview of Unicode Properties

Unique unicode characters: 24
Unique unicode categories: 6
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
n27108111.0%
 
o26040310.6%
 
25686710.5%
 
e2063518.4%
 
i1920977.8%
 
t1552976.3%
 
r1463446.0%
 
l1199414.9%
 
h905673.7%
 
d861023.5%
 
6834493.4%
 
5834493.4%
 
f825203.4%
 
J795823.2%
 
u786763.2%
 
b757153.1%
 
N750943.1%
 
S374211.5%
 
g374211.5%
 
+121990.5%
 
H74260.3%
 
a74260.3%
 
s74260.3%
 
&38670.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter181736774.0%
 
Space Separator25686710.5%
 
Uppercase Letter1995238.1%
 
Decimal Number1668986.8%
 
Math Symbol121990.5%
 
Other Punctuation38670.2%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
J7958239.9%
 
N7509437.6%
 
S3742118.8%
 
H74263.7%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n27108114.9%
 
o26040314.3%
 
e20635111.4%
 
i19209710.6%
 
t1552978.5%
 
r1463448.1%
 
l1199416.6%
 
h905675.0%
 
d861024.7%
 
f825204.5%
 
u786764.3%
 
b757154.2%
 
g374212.1%
 
a74260.4%
 
s74260.4%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
256867100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
68344950.0%
 
58344950.0%
 

Most frequent Math Symbol characters

ValueCountFrequency (%) 
+12199100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
&3867100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin201689082.1%
 
Common43983117.9%
 

Most frequent Latin characters

ValueCountFrequency (%) 
n27108113.4%
 
o26040312.9%
 
e20635110.2%
 
i1920979.5%
 
t1552977.7%
 
r1463447.3%
 
l1199415.9%
 
h905674.5%
 
d861024.3%
 
f825204.1%
 
J795823.9%
 
u786763.9%
 
b757153.8%
 
N750943.7%
 
S374211.9%
 
g374211.9%
 
H74260.4%
 
a74260.4%
 
s74260.4%
 

Most frequent Common characters

ValueCountFrequency (%) 
25686758.4%
 
68344919.0%
 
58344919.0%
 
+121992.8%
 
&38670.9%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2456721100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
n27108111.0%
 
o26040310.6%
 
25686710.5%
 
e2063518.4%
 
i1920977.8%
 
t1552976.3%
 
r1463446.0%
 
l1199414.9%
 
h905673.7%
 
d861023.5%
 
6834493.4%
 
5834493.4%
 
f825203.4%
 
J795823.2%
 
u786763.2%
 
b757153.1%
 
N750943.1%
 
S374211.5%
 
g374211.5%
 
+121990.5%
 
H74260.3%
 
a74260.3%
 
s74260.3%
 
&38670.2%
 

region_of_previous_residence
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value            Count   Frequency (%)
Not in universe  183750  92.1%
South            4889    2.5%
West             4074    2.0%
Midwest          3575    1.8%
Northeast        2705    1.4%
Abroad           530     0.3%

Unique

Unique: 0
Unique (%): 0.0%

Length

Max length: 15
Median length: 15
Mean length: 14.28176701
Min length: 4

Overview of Unicode Properties

Unique unicode characters: 20
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e37785413.3%
 
i37107513.0%
 
36750012.9%
 
n36750012.9%
 
t2016987.1%
 
s1941046.8%
 
o1918746.7%
 
u1886396.6%
 
r1869856.6%
 
N1864556.5%
 
v1837506.4%
 
h75940.3%
 
S48890.2%
 
d41050.1%
 
W40740.1%
 
M35750.1%
 
w35750.1%
 
a32350.1%
 
A530< 0.1%
 
b530< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter228251880.1%
 
Space Separator36750012.9%
 
Uppercase Letter1995237.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
N18645593.5%
 
S48892.5%
 
W40742.0%
 
M35751.8%
 
A5300.3%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e37785416.6%
 
i37107516.3%
 
n36750016.1%
 
t2016988.8%
 
s1941048.5%
 
o1918748.4%
 
u1886398.3%
 
r1869858.2%
 
v1837508.1%
 
h75940.3%
 
d41050.2%
 
w35750.2%
 
a32350.1%
 
b530< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
367500100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin248204187.1%
 
Common36750012.9%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e37785415.2%
 
i37107515.0%
 
n36750014.8%
 
t2016988.1%
 
s1941047.8%
 
o1918747.7%
 
u1886397.6%
 
r1869857.5%
 
N1864557.5%
 
v1837507.4%
 
h75940.3%
 
S48890.2%
 
d41050.2%
 
W40740.2%
 
M35750.1%
 
w35750.1%
 
a32350.1%
 
A530< 0.1%
 
b530< 0.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
367500100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2849541100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e37785413.3%
 
i37107513.0%
 
36750012.9%
 
n36750012.9%
 
t2016987.1%
 
s1941046.8%
 
o1918746.7%
 
u1886396.6%
 
r1869856.6%
 
N1864556.5%
 
v1837506.4%
 
h75940.3%
 
S48890.2%
 
d41050.1%
 
W40740.1%
 
M35750.1%
 
w35750.1%
 
a32350.1%
 
A530< 0.1%
 
b530< 0.1%
 

state_of_previous_residence
Categorical

HIGH CARDINALITY
HIGH CORRELATION

Distinct: 51
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value              Count   Frequency (%)
Not in universe    183750  92.1%
California         1714    0.9%
Utah               1063    0.5%
Florida            849     0.4%
North Carolina     812     0.4%
?                  708     0.4%
Abroad             671     0.3%
Oklahoma           626     0.3%
Minnesota          576     0.3%
Indiana            533     0.3%
North Dakota       499     0.3%
New Mexico         463     0.2%
Michigan           441     0.2%
Alaska             290     0.1%
Kentucky           244     0.1%
Arizona            243     0.1%
New Hampshire      242     0.1%
Wyoming            241     0.1%
Colorado           239     0.1%
Oregon             236     0.1%
West Virginia      231     0.1%
Georgia            227     0.1%
Montana            226     0.1%
Alabama            216     0.1%
Ohio               211     0.1%
Other values (26)  3972    2.0%

Unique

Unique: 0
Unique (%): 0.0%

Length

Max length: 20
Median length: 15
Mean length: 14.45687465
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 46
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
i38032413.2%
 
n37721813.1%
 
e37318412.9%
 
37048212.8%
 
o1954456.8%
 
r1920906.7%
 
s1893306.6%
 
t1892306.6%
 
N1863886.5%
 
u1849786.4%
 
v1841236.4%
 
a190480.7%
 
l57250.2%
 
h43090.1%
 
C30930.1%
 
d26330.1%
 
M25390.1%
 
k23750.1%
 
f18300.1%
 
c17540.1%
 
m16320.1%
 
A16250.1%
 
g15020.1%
 
w1237< 0.1%
 
b1181< 0.1%
 
Other values (21)112040.4%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter231160880.1%
 
Space Separator37048212.8%
 
Uppercase Letter2016817.0%
 
Other Punctuation708< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
N18638892.4%
 
C30931.5%
 
M25391.3%
 
A16250.8%
 
O10730.5%
 
U10630.5%
 
I9330.5%
 
F8490.4%
 
D8260.4%
 
W5770.3%
 
V5480.3%
 
T4110.2%
 
K3930.2%
 
H2420.1%
 
S2330.1%
 
G2270.1%
 
P1990.1%
 
Y1950.1%
 
L1920.1%
 
J75< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
i38032416.5%
 
n37721816.3%
 
e37318416.1%
 
o1954458.5%
 
r1920908.3%
 
s1893308.2%
 
t1892308.2%
 
u1849788.0%
 
v1841238.0%
 
a190480.8%
 
l57250.2%
 
h43090.2%
 
d26330.1%
 
k23750.1%
 
f18300.1%
 
c17540.1%
 
m16320.1%
 
g15020.1%
 
w12370.1%
 
b11810.1%
 
y895< 0.1%
 
x672< 0.1%
 
p650< 0.1%
 
z243< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
370482100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
?708100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin251328987.1%
 
Common37119012.9%
 

Most frequent Latin characters

ValueCountFrequency (%) 
i38032415.1%
 
n37721815.0%
 
e37318414.8%
 
o1954457.8%
 
r1920907.6%
 
s1893307.5%
 
t1892307.5%
 
N1863887.4%
 
u1849787.4%
 
v1841237.3%
 
a190480.8%
 
l57250.2%
 
h43090.2%
 
C30930.1%
 
d26330.1%
 
M25390.1%
 
k23750.1%
 
f18300.1%
 
c17540.1%
 
m16320.1%
 
A16250.1%
 
g15020.1%
 
w1237< 0.1%
 
b1181< 0.1%
 
O1073< 0.1%
 
Other values (19)94230.4%
 

Most frequent Common characters

ValueCountFrequency (%) 
37048299.8%
 
?7080.2%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2884479100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
i38032413.2%
 
n37721813.1%
 
e37318412.9%
 
37048212.8%
 
o1954456.8%
 
r1920906.7%
 
s1893306.6%
 
t1892306.6%
 
N1863886.5%
 
u1849786.4%
 
v1841236.4%
 
a190480.7%
 
l57250.2%
 
h43090.1%
 
C30930.1%
 
d26330.1%
 
M25390.1%
 
k23750.1%
 
f18300.1%
 
c17540.1%
 
m16320.1%
 
A16250.1%
 
g15020.1%
 
w1237< 0.1%
 
b1181< 0.1%
 
Other values (21)112040.4%
 

detailed_household_and_family_stat
Categorical

HIGH CORRELATION

Distinct: 38
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value                                            Count  Frequency (%)
Householder                                      53248  26.7%
Child <18 never marr not in subfamily            50326  25.2%
Spouse of householder                            41695  20.9%
Nonfamily householder                            22213  11.1%
Child 18+ never marr Not in a subfamily          12030  6.0%
Secondary individual                             6122   3.1%
Other Rel 18+ ever marr not in subfamily         1956   1.0%
Grandchild <18 never marr child of subfamily RP  1868   0.9%
Other Rel 18+ never marr not in subfamily        1728   0.9%
Grandchild <18 never marr not in subfamily       1066   0.5%
Child 18+ ever marr Not in a subfamily           1013   0.5%
Child under 18 of RP of unrel subfamily          732    0.4%
RP of unrelated subfamily                        685    0.3%
Child 18+ ever marr RP of subfamily              671    0.3%
Other Rel 18+ ever marr RP of subfamily          656    0.3%
Other Rel <18 never marr child of subfamily RP   656    0.3%
Other Rel 18+ spouse of subfamily RP             638    0.3%
Child 18+ never marr RP of subfamily             589    0.3%
Other Rel <18 never marr not in subfamily        584    0.3%
Grandchild 18+ never marr not in subfamily       375    0.2%
In group quarters                                196    0.1%
Child 18+ spouse of subfamily RP                 126    0.1%
Other Rel 18+ never marr RP of subfamily         94     < 0.1%
Child <18 never marr RP of subfamily             80     < 0.1%
Spouse of RP of unrelated subfamily              52     < 0.1%
Other values (13)                                124    0.1%

Unique

Unique: 1
Unique (%): < 0.1%

Length

Max length: 47
Median length: 21
Mean length: 24.71388762
Min length: 11

Overview of Unicode Properties

Unique unicode characters: 35
Unique unicode categories: 5
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
58815011.9%
 
e4463529.1%
 
o4238978.6%
 
r3571687.2%
 
l3008456.1%
 
h2589005.3%
 
i2572935.2%
 
u2444465.0%
 
s2367064.8%
 
n2348934.8%
 
d2118774.3%
 
a2016554.1%
 
m1720633.5%
 
f1476393.0%
 
y1043842.1%
 
v799231.6%
 
t764101.5%
 
b760491.5%
 
1753121.5%
 
8753121.5%
 
C656141.3%
 
<546451.1%
 
H532481.1%
 
S478691.0%
 
p427220.9%
 
Other values (10)976172.0%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter388563278.8%
 
Space Separator58815011.9%
 
Uppercase Letter2320034.7%
 
Decimal Number1506243.1%
 
Math Symbol745801.5%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
C6561428.3%
 
H5324823.0%
 
S4786920.6%
 
N3525615.2%
 
R132245.7%
 
P68983.0%
 
O63262.7%
 
G33721.5%
 
I1960.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e44635211.5%
 
o42389710.9%
 
r3571689.2%
 
l3008457.7%
 
h2589006.7%
 
i2572936.6%
 
u2444466.3%
 
s2367066.1%
 
n2348936.0%
 
d2118775.5%
 
a2016555.2%
 
m1720634.4%
 
f1476393.8%
 
y1043842.7%
 
v799232.1%
 
t764102.0%
 
b760492.0%
 
p427221.1%
 
c120180.3%
 
g196< 0.1%
 
q196< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
588150100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
17531250.0%
 
87531250.0%
 

Most frequent Math Symbol characters

ValueCountFrequency (%) 
<5464573.3%
 
+1993526.7%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin411763583.5%
 
Common81335416.5%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e44635210.8%
 
o42389710.3%
 
r3571688.7%
 
l3008457.3%
 
h2589006.3%
 
i2572936.2%
 
u2444465.9%
 
s2367065.7%
 
n2348935.7%
 
d2118775.1%
 
a2016554.9%
 
m1720634.2%
 
f1476393.6%
 
y1043842.5%
 
v799231.9%
 
t764101.9%
 
b760491.8%
 
C656141.6%
 
H532481.3%
 
S478691.2%
 
p427221.0%
 
N352560.9%
 
R132240.3%
 
c120180.3%
 
P68980.2%
 
Other values (5)102860.2%
 

Most frequent Common characters

ValueCountFrequency (%) 
58815072.3%
 
1753129.3%
 
8753129.3%
 
<546456.7%
 
+199352.5%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII4930989100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
58815011.9%
 
e4463529.1%
 
o4238978.6%
 
r3571687.2%
 
l3008456.1%
 
h2589005.3%
 
i2572935.2%
 
u2444465.0%
 
s2367064.8%
 
n2348934.8%
 
d2118774.3%
 
a2016554.1%
 
m1720633.5%
 
f1476393.0%
 
y1043842.1%
 
v799231.6%
 
t764101.5%
 
b760491.5%
 
1753121.5%
 
8753121.5%
 
C656141.3%
 
<546451.1%
 
H532481.1%
 
S478691.0%
 
p427220.9%
 
Other values (10)976172.0%
 

detailed_household_summary_in_household
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value                                 Count  Frequency (%)
Householder                           75475  37.8%
Child under 18 never married          50426  25.3%
Spouse of householder                 41709  20.9%
Child 18 or older                     14430  7.2%
Other relative of householder         9703   4.9%
Nonrelative of householder            7601   3.8%
Group Quarters- Secondary individual  132    0.1%
Child under 18 ever married           47     < 0.1%

Unique

Unique: 0
Unique (%): 0.0%

Length

Max length: 36
Median length: 21
Mean length: 19.28793172
Min length: 11

Overview of Unicode Properties

Unique unicode characters: 29
Unique unicode categories: 5
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e57158214.9%
 
o40642310.6%
 
r39277510.2%
 
3733079.7%
 
d3151638.2%
 
h2681077.0%
 
l2312576.0%
 
u2270665.9%
 
s1763294.6%
 
i1330763.5%
 
n1087642.8%
 
H754752.0%
 
a681731.8%
 
v679091.8%
 
C649031.7%
 
1649031.7%
 
8649031.7%
 
f590131.5%
 
m504731.3%
 
S418411.1%
 
p418411.1%
 
t271390.7%
 
O97030.3%
 
N76010.2%
 
G132< 0.1%
 
Other values (4)528< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter314535481.7%
 
Space Separator3733079.7%
 
Uppercase Letter1997875.2%
 
Decimal Number1298063.4%
 
Dash Punctuation132< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
H7547537.8%
 
C6490332.5%
 
S4184120.9%
 
O97034.9%
 
N76013.8%
 
G1320.1%
 
Q1320.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e57158218.2%
 
o40642312.9%
 
r39277512.5%
 
d31516310.0%
 
h2681078.5%
 
l2312577.4%
 
u2270667.2%
 
s1763295.6%
 
i1330764.2%
 
n1087643.5%
 
a681732.2%
 
v679092.2%
 
f590131.9%
 
m504731.6%
 
p418411.3%
 
t271390.9%
 
c132< 0.1%
 
y132< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
373307100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
16490350.0%
 
86490350.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-132100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin334514186.9%
 
Common50324513.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e57158217.1%
 
o40642312.1%
 
r39277511.7%
 
d3151639.4%
 
h2681078.0%
 
l2312576.9%
 
u2270666.8%
 
s1763295.3%
 
i1330764.0%
 
n1087643.3%
 
H754752.3%
 
a681732.0%
 
v679092.0%
 
C649031.9%
 
f590131.8%
 
m504731.5%
 
S418411.3%
 
p418411.3%
 
t271390.8%
 
O97030.3%
 
N76010.2%
 
G132< 0.1%
 
Q132< 0.1%
 
c132< 0.1%
 
y132< 0.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
37330774.2%
 
16490312.9%
 
86490312.9%
 
-132< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII3848386100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e57158214.9%
 
o40642310.6%
 
r39277510.2%
 
3733079.7%
 
d3151638.2%
 
h2681077.0%
 
l2312576.0%
 
u2270665.9%
 
s1763294.6%
 
i1330763.5%
 
n1087642.8%
 
H754752.0%
 
a681731.8%
 
v679091.8%
 
C649031.7%
 
1649031.7%
 
8649031.7%
 
f590131.5%
 
m504731.3%
 
S418411.1%
 
p418411.1%
 
t271390.7%
 
O97030.3%
 
N76010.2%
 
G132< 0.1%
 
Other values (4)528< 0.1%
 

instance_weight
Real number (ℝ≥0)

Distinct: 99800
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1740.380269
Minimum: 37.87
Maximum: 18656.3
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 37.87
5-th percentile: 395.342
Q1: 1061.615
Median: 1618.31
Q3: 2188.61
95-th percentile: 3585.909
Maximum: 18656.3
Range: 18618.43
Interquartile range (IQR): 1126.995

Descriptive statistics

Standard deviation: 993.7681558
Coefficient of variation (CV): 0.5710063331
Kurtosis: 5.412514036
Mean: 1740.380269
Median Absolute Deviation (MAD): 561.46
Skewness: 1.432733152
Sum: 347245892.5
Variance: 987575.1475
Monotonicity: Not monotonic
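
Several of the instance_weight statistics above are simple functions of the others. The sketch below recomputes them from the reported figures, assuming the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum). The variable names are illustrative only.

```python
import math

# Figures copied from the instance_weight statistics in this report.
mean = 1740.380269
std = 993.7681558
variance = 987575.1475
q1, q3 = 1061.615, 2188.61
minimum, maximum = 37.87, 18656.3
total, n_rows = 347245892.5, 199523

# Derived quantities under the standard definitions.
cv = std / mean                    # coefficient of variation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range
mean_from_sum = total / n_rows     # mean recovered from sum and row count

print(f"CV        = {cv:.10f}")
print(f"IQR       = {iqr:.3f}")
print(f"Range     = {value_range:.2f}")
print(f"std ** 2  = {std ** 2:.4f}")
print(f"sum / n   = {mean_from_sum:.6f}")
```

Each derived value should match the corresponding reported figure up to rounding of the inputs.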
Histogram with fixed size bins (bins=50)
Most frequent values

Value                 Count   Frequency (%)
1601.4                32      < 0.1%
753.23                32      < 0.1%
1191.21               32      < 0.1%
1787.34               32      < 0.1%
707.9                 31      < 0.1%
1317.51               31      < 0.1%
1070.15               30      < 0.1%
1839.19               28      < 0.1%
1002.02               28      < 0.1%
1009.39               28      < 0.1%
1033.83               28      < 0.1%
1029.73               27      < 0.1%
1122.6                27      < 0.1%
1528.84               27      < 0.1%
964.5                 26      < 0.1%
1244.66               26      < 0.1%
1011.71               26      < 0.1%
1155.2                26      < 0.1%
988.79                26      < 0.1%
1882.96               26      < 0.1%
974.01                26      < 0.1%
1218.82               26      < 0.1%
1138.19               25      < 0.1%
1739.89               25      < 0.1%
1032.82               25      < 0.1%
Other values (99775)  198827  99.7%

Smallest values

Value   Count   Frequency (%)
37.87   1       < 0.1%
39.11   1       < 0.1%
40.67   2       < 0.1%
42.82   2       < 0.1%
43.26   3       < 0.1%
45.74   2       < 0.1%
47.83   6       < 0.1%
49.82   2       < 0.1%
52.43   1       < 0.1%
52.46   4       < 0.1%

Largest values

Value     Count   Frequency (%)
18656.3   1       < 0.1%
16349.2   1       < 0.1%
13911.5   1       < 0.1%
13145.1   1       < 0.1%
13114.2   1       < 0.1%
12960.2   1       < 0.1%
12399.9   1       < 0.1%
12184.5   1       < 0.1%
11958.4   1       < 0.1%
11863     1       < 0.1%


migration_code-change_in_msa
Categorical

HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value             Count   Frequency (%)
?                 99696   50.0%
Nonmover          82538   41.4%
MSA to MSA        10601   5.3%
NonMSA to nonMSA  2811    1.4%
Not in universe   1516    0.8%
MSA to nonMSA     790     0.4%
NonMSA to MSA     615     0.3%
Abroad to MSA     453     0.2%
Not identifiable  430     0.2%
Abroad to nonMSA  73      < 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 16
Median length: 8
Mean length: 4.841186229
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 21
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
o18999119.7%
 
?9969610.3%
 
n9677410.0%
 
N879109.1%
 
e864308.9%
 
r845808.8%
 
v840548.7%
 
m825388.5%
 
341483.5%
 
A306863.2%
 
M301603.1%
 
S301603.1%
 
t177191.8%
 
i43220.4%
 
u15160.2%
 
s15160.2%
 
d9560.1%
 
a9560.1%
 
b9560.1%
 
f430< 0.1%
 
l430< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter65316867.6%
 
Uppercase Letter17891618.5%
 
Other Punctuation9969610.3%
 
Space Separator341483.5%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
?99696100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
N8791049.1%
 
A3068617.2%
 
M3016016.9%
 
S3016016.9%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
34148100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
o18999129.1%
 
n9677414.8%
 
e8643013.2%
 
r8458012.9%
 
v8405412.9%
 
m8253812.6%
 
t177192.7%
 
i43220.7%
 
u15160.2%
 
s15160.2%
 
d9560.1%
 
a9560.1%
 
b9560.1%
 
f4300.1%
 
l4300.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin83208486.1%
 
Common13384413.9%
 

Most frequent Common characters

ValueCountFrequency (%) 
?9969674.5%
 
3414825.5%
 

Most frequent Latin characters

ValueCountFrequency (%) 
o18999122.8%
 
n9677411.6%
 
N8791010.6%
 
e8643010.4%
 
r8458010.2%
 
v8405410.1%
 
m825389.9%
 
A306863.7%
 
M301603.6%
 
S301603.6%
 
t177192.1%
 
i43220.5%
 
u15160.2%
 
s15160.2%
 
d9560.1%
 
a9560.1%
 
b9560.1%
 
f4300.1%
 
l4300.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII965928100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
o18999119.7%
 
?9969610.3%
 
n9677410.0%
 
N879109.1%
 
e864308.9%
 
r845808.8%
 
v840548.7%
 
m825388.5%
 
341483.5%
 
A306863.2%
 
M301603.1%
 
S301603.1%
 
t177191.8%
 
i43220.4%
 
u15160.2%
 
s15160.2%
 
d9560.1%
 
a9560.1%
 
b9560.1%
 
f430< 0.1%
 
l430< 0.1%
 

migration_code-change_in_reg
Categorical

HIGH CORRELATION

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value                           Count   Frequency (%)
?                               99696   50.0%
Nonmover                        82538   41.4%
Same county                     9812    4.9%
Different county same state     2797    1.4%
Not in universe                 1516    0.8%
Different region                1178    0.6%
Different state same division   991     0.5%
Abroad                          530     0.3%
Different division same region  465     0.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 30
Median length: 6
Mean length: 5.166862968
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 23
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
o18283017.7%
 
e11592811.2%
 
n10670910.4%
 
?996969.7%
 
m966039.4%
 
r916588.9%
 
v855108.3%
 
N840548.2%
 
t271322.6%
 
267812.6%
 
a183831.8%
 
i144741.4%
 
u141251.4%
 
c126091.2%
 
y126091.2%
 
s110131.1%
 
f108621.1%
 
S98121.0%
 
D54310.5%
 
d19860.2%
 
g16430.2%
 
A5300.1%
 
b5300.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter80460478.0%
 
Uppercase Letter998279.7%
 
Other Punctuation996969.7%
 
Space Separator267812.6%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
?99696100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
N8405484.2%
 
S98129.8%
 
D54315.4%
 
A5300.5%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
o18283022.7%
 
e11592814.4%
 
n10670913.3%
 
m9660312.0%
 
r9165811.4%
 
v8551010.6%
 
t271323.4%
 
a183832.3%
 
i144741.8%
 
u141251.8%
 
c126091.6%
 
y126091.6%
 
s110131.4%
 
f108621.3%
 
d19860.2%
 
g16430.2%
 
b5300.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
26781100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin90443187.7%
 
Common12647712.3%
 

Most frequent Common characters

ValueCountFrequency (%) 
?9969678.8%
 
2678121.2%
 

Most frequent Latin characters

ValueCountFrequency (%) 
o18283020.2%
 
e11592812.8%
 
n10670911.8%
 
m9660310.7%
 
r9165810.1%
 
v855109.5%
 
N840549.3%
 
t271323.0%
 
a183832.0%
 
i144741.6%
 
u141251.6%
 
c126091.4%
 
y126091.4%
 
s110131.2%
 
f108621.2%
 
S98121.1%
 
D54310.6%
 
d19860.2%
 
g16430.2%
 
A5300.1%
 
b5300.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII1030908100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
o18283017.7%
 
e11592811.2%
 
n10670910.4%
 
?996969.7%
 
m966039.4%
 
r916588.9%
 
v855108.3%
 
N840548.2%
 
t271322.6%
 
267812.6%
 
a183831.8%
 
i144741.4%
 
u141251.4%
 
c126091.2%
 
y126091.2%
 
s110131.1%
 
f108621.1%
 
S98121.0%
 
D54310.5%
 
d19860.2%
 
g16430.2%
 
A5300.1%
 
b5300.1%
 

migration_code-move_within_reg
Categorical

HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value                         Count   Frequency (%)
?                             99696   50.0%
Nonmover                      82538   41.4%
Same county                   9812    4.9%
Different county same state   2797    1.4%
Not in universe               1516    0.8%
Different state in South      973     0.5%
Different state in West       679     0.3%
Different state in Midwest    551     0.3%
Abroad                        530     0.3%
Different state in Northeast  431     0.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 28
Median length: 6
Mean length: 5.186038702
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 26
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
o18113517.5%
 
e11613311.2%
 
n10624410.3%
 
?996969.6%
 
m951479.2%
 
r904468.7%
 
N844858.2%
 
v840548.1%
 
t334833.2%
 
291372.8%
 
a190011.8%
 
u150981.5%
 
c126091.2%
 
y126091.2%
 
i116481.1%
 
s114051.1%
 
f108621.0%
 
S107851.0%
 
D54310.5%
 
h14040.1%
 
d10810.1%
 
W6790.1%
 
M5510.1%
 
w5510.1%
 
A5300.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter80344077.6%
 
Uppercase Letter1024619.9%
 
Other Punctuation996969.6%
 
Space Separator291372.8%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
?99696100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
N8448582.5%
 
S1078510.5%
 
D54315.3%
 
W6790.7%
 
M5510.5%
 
A5300.5%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
o18113522.5%
 
e11613314.5%
 
n10624413.2%
 
m9514711.8%
 
r9044611.3%
 
v8405410.5%
 
t334834.2%
 
a190012.4%
 
u150981.9%
 
c126091.6%
 
y126091.6%
 
i116481.4%
 
s114051.4%
 
f108621.4%
 
h14040.2%
 
d10810.1%
 
w5510.1%
 
b5300.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
29137100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin90590187.5%
 
Common12883312.5%
 

Most frequent Common characters

ValueCountFrequency (%) 
?9969677.4%
 
2913722.6%
 

Most frequent Latin characters

ValueCountFrequency (%) 
o18113520.0%
 
e11613312.8%
 
n10624411.7%
 
m9514710.5%
 
r9044610.0%
 
N844859.3%
 
v840549.3%
 
t334833.7%
 
a190012.1%
 
u150981.7%
 
c126091.4%
 
y126091.4%
 
i116481.3%
 
s114051.3%
 
f108621.2%
 
S107851.2%
 
D54310.6%
 
h14040.2%
 
d10810.1%
 
W6790.1%
 
M5510.1%
 
w5510.1%
 
A5300.1%
 
b5300.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII1034734100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
o18113517.5%
 
e11613311.2%
 
n10624410.3%
 
?996969.6%
 
m951479.2%
 
r904468.7%
 
N844858.2%
 
v840548.1%
 
t334833.2%
 
291372.8%
 
a190011.8%
 
u150981.5%
 
c126091.2%
 
y126091.2%
 
i116481.1%
 
s114051.1%
 
f108621.0%
 
S107851.0%
 
D54310.5%
 
h14040.1%
 
d10810.1%
 
W6790.1%
 
M5510.1%
 
w5510.1%
 
A5300.1%
 

live_in_this_house_1_year_ago
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value                             Count   Frequency (%)
Not in universe under 1 year old  101212  50.7%
Yes                               82538   41.4%
No                                15773   7.9%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 32
Median length: 32
Mean length: 17.63177178
Min length: 2

Overview of Unicode Properties

Unique unicode characters: 17
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
60727217.3%
 
e48738613.9%
 
n3036368.6%
 
r3036368.6%
 
o2181976.2%
 
i2024245.8%
 
u2024245.8%
 
d2024245.8%
 
s1837505.2%
 
N1169853.3%
 
t1012122.9%
 
v1012122.9%
 
11012122.9%
 
y1012122.9%
 
a1012122.9%
 
l1012122.9%
 
Y825382.3%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter260993774.2%
 
Space Separator60727217.3%
 
Uppercase Letter1995235.7%
 
Decimal Number1012122.9%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
N11698558.6%
 
Y8253841.4%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e48738618.7%
 
n30363611.6%
 
r30363611.6%
 
o2181978.4%
 
i2024247.8%
 
u2024247.8%
 
d2024247.8%
 
s1837507.0%
 
t1012123.9%
 
v1012123.9%
 
y1012123.9%
 
a1012123.9%
 
l1012123.9%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
607272100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
1101212100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin280946079.9%
 
Common70848420.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e48738617.3%
 
n30363610.8%
 
r30363610.8%
 
o2181977.8%
 
i2024247.2%
 
u2024247.2%
 
d2024247.2%
 
s1837506.5%
 
N1169854.2%
 
t1012123.6%
 
v1012123.6%
 
y1012123.6%
 
a1012123.6%
 
l1012123.6%
 
Y825382.9%
 

Most frequent Common characters

ValueCountFrequency (%) 
60727285.7%
 
110121214.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII3517944100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
60727217.3%
 
e48738613.9%
 
n3036368.6%
 
r3036368.6%
 
o2181976.2%
 
i2024245.8%
 
u2024245.8%
 
d2024245.8%
 
s1837505.2%
 
N1169853.3%
 
t1012122.9%
 
v1012122.9%
 
11012122.9%
 
y1012122.9%
 
a1012122.9%
 
l1012122.9%
 
Y825382.3%
 

migration_prev_res_in_sunbelt
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value            Count   Frequency (%)
?                99696   50.0%
Not in universe  84054   42.1%
No               9987    5.0%
Yes              5786    2.9%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 15
Median length: 2
Mean length: 7.005899069
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 13
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e17389412.4%
 
16810812.0%
 
i16810812.0%
 
n16810812.0%
 
?996967.1%
 
N940416.7%
 
o940416.7%
 
s898406.4%
 
t840546.0%
 
u840546.0%
 
v840546.0%
 
r840546.0%
 
Y57860.4%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter103020773.7%
 
Space Separator16810812.0%
 
Uppercase Letter998277.1%
 
Other Punctuation996967.1%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
?99696100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
N9404194.2%
 
Y57865.8%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e17389416.9%
 
i16810816.3%
 
n16810816.3%
 
o940419.1%
 
s898408.7%
 
t840548.2%
 
u840548.2%
 
v840548.2%
 
r840548.2%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
168108100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin113003480.8%
 
Common26780419.2%
 

Most frequent Common characters

ValueCountFrequency (%) 
16810862.8%
 
?9969637.2%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e17389415.4%
 
i16810814.9%
 
n16810814.9%
 
N940418.3%
 
o940418.3%
 
s898408.0%
 
t840547.4%
 
u840547.4%
 
v840547.4%
 
r840547.4%
 
Y57860.5%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII1397838100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e17389412.4%
 
16810812.0%
 
i16810812.0%
 
n16810812.0%
 
?996967.1%
 
N940416.7%
 
o940416.7%
 
s898406.4%
 
t840546.0%
 
u840546.0%
 
v840546.0%
 
r840546.0%
 
Y57860.4%
 

num_persons_worked_for_employer
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.95618049
Minimum: 0
Maximum: 6
Zeros: 95983
Zeros (%): 48.1%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 4
95-th percentile: 6
Maximum: 6
Range: 6
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.365125505
Coefficient of variation (CV): 1.209052803
Kurtosis: -1.082246833
Mean: 1.95618049
Median Absolute Deviation (MAD): 1
Skewness: 0.7515606804
Sum: 390303
Variance: 5.593818657
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Values by frequency

Value   Count   Frequency (%)
0       95983   48.1%
6       36511   18.3%
1       23109   11.6%
4       14379   7.2%
3       13425   6.7%
2       10081   5.1%
5       6035    3.0%

Values in ascending order

Value   Count   Frequency (%)
0       95983   48.1%
1       23109   11.6%
2       10081   5.1%
3       13425   6.7%
4       14379   7.2%
5       6035    3.0%
6       36511   18.3%

Values in descending order

Value   Count   Frequency (%)
6       36511   18.3%
5       6035    3.0%
4       14379   7.2%
3       13425   6.7%
2       10081   5.1%
1       23109   11.6%
0       95983   48.1%
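
Because num_persons_worked_for_employer takes only seven distinct values, its summary statistics can be recomputed directly from the value counts. The sketch below does so as a consistency check. It assumes the reported variance uses the sample (n - 1) denominator, which is an assumption about the tool's convention rather than something the report states; the variable names are illustrative.

```python
import math

# Value counts copied from the frequency table for
# num_persons_worked_for_employer.
value_counts = {0: 95983, 1: 23109, 2: 10081, 3: 13425,
                4: 14379, 5: 6035, 6: 36511}

n_rows = sum(value_counts.values())
total = sum(value * count for value, count in value_counts.items())
mean = total / n_rows
zeros_pct = 100 * value_counts[0] / n_rows

# Median: the first value whose cumulative count reaches half of the rows
# (exact here because the number of rows is odd).
cumulative = 0
median = None
for value in sorted(value_counts):
    cumulative += value_counts[value]
    if cumulative >= n_rows / 2:
        median = value
        break

# Sample variance with an n - 1 denominator (assumed convention).
sum_sq_dev = sum(count * (value - mean) ** 2
                 for value, count in value_counts.items())
variance = sum_sq_dev / (n_rows - 1)
std = math.sqrt(variance)

print(n_rows, total, median)
print(f"mean={mean:.8f} zeros%={zeros_pct:.1f} var={variance:.9f} std={std:.9f}")
```

Under that assumption the recomputed sum, mean, median, zero share, variance, and standard deviation should agree with the reported statistics up to rounding.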
 
Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value                   Count   Frequency (%)
Not in universe         144232  72.3%
Both parents present    38983   19.5%
Mother only present     12772   6.4%
Father only present     1883    0.9%
Neither parent present  1653    0.8%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 22
Median length: 15
Mean length: 16.32869895
Min length: 15

Overview of Unicode Properties

Unique unicode characters: 19
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e45764314.0%
 
39904612.2%
 
n39904612.2%
 
t2954509.1%
 
i2901178.9%
 
r2564677.9%
 
s2385067.3%
 
o2106426.5%
 
N1458854.5%
 
u1442324.4%
 
v1442324.4%
 
p959272.9%
 
h552911.7%
 
a425191.3%
 
B389831.2%
 
l146550.4%
 
y146550.4%
 
M127720.4%
 
F18830.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter265938281.6%
 
Space Separator39904612.2%
 
Uppercase Letter1995236.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
N14588573.1%
 
B3898319.5%
 
M127726.4%
 
F18830.9%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e45764317.2%
 
n39904615.0%
 
t29545011.1%
 
i29011710.9%
 
r2564679.6%
 
s2385069.0%
 
o2106427.9%
 
u1442325.4%
 
v1442325.4%
 
p959273.6%
 
h552912.1%
 
a425191.6%
 
l146550.6%
 
y146550.6%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
399046100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin285890587.8%
 
Common39904612.2%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e45764316.0%
 
n39904614.0%
 
t29545010.3%
 
i29011710.1%
 
r2564679.0%
 
s2385068.3%
 
o2106427.4%
 
N1458855.1%
 
u1442325.0%
 
v1442325.0%
 
p959273.4%
 
h552911.9%
 
a425191.5%
 
B389831.4%
 
l146550.5%
 
y146550.5%
 
M127720.4%
 
F18830.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
399046100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII3257951100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e45764314.0%
 
39904612.2%
 
n39904612.2%
 
t2954509.1%
 
i2901178.9%
 
r2564677.9%
 
s2385067.3%
 
o2106426.5%
 
N1458854.5%
 
u1442324.4%
 
v1442324.4%
 
p959272.9%
 
h552911.7%
 
a425191.3%
 
B389831.2%
 
l146550.4%
 
y146550.4%
 
M127720.4%
 
F18830.1%
 
Distinct: 43
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value               Count   Frequency (%)
United-States       159163  79.8%
Mexico              10008   5.0%
?                   6713    3.4%
Puerto-Rico         2680    1.3%
Italy               2212    1.1%
Canada              1380    0.7%
Germany             1356    0.7%
Dominican-Republic  1290    0.6%
Poland              1212    0.6%
Philippines         1154    0.6%
Cuba                1125    0.6%
El-Salvador         982     0.5%
China               856     0.4%
England             793     0.4%
Columbia            614     0.3%
India               580     0.3%
South Korea         530     0.3%
Ireland             508     0.3%
Jamaica             463     0.2%
Vietnam             457     0.2%
Guatemala           445     0.2%
Japan               392     0.2%
Portugal            388     0.2%
Ecuador             379     0.2%
Haiti               351     0.2%
Other values (18)   3492    1.8%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 28
Median length: 13
Mean length: 11.66875999
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 47
Unique unicode categories: 7
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
t48516820.8%
 
e33857314.5%
 
a1858098.0%
 
i1841617.9%
 
n1733127.4%
 
d1660697.1%
 
-1643257.1%
 
S1612406.9%
 
s1609336.9%
 
U1594816.9%
 
o227901.0%
 
c173660.7%
 
l114120.5%
 
M100080.4%
 
x100080.4%
 
u91360.4%
 
r89050.4%
 
?67130.3%
 
P57940.2%
 
m50050.2%
 
C41710.2%
 
y40330.2%
 
p39900.2%
 
R39700.2%
 
I36920.2%
 
Other values (22)221221.0%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter179660777.2%
 
Uppercase Letter35883815.4%
 
Dash Punctuation1643257.1%
 
Other Punctuation68260.3%
 
Space Separator12720.1%
 
Open Punctuation159< 0.1%
 
Close Punctuation159< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
S16124044.9%
 
U15948144.4%
 
M100082.8%
 
P57941.6%
 
C41711.2%
 
R39701.1%
 
I36921.0%
 
G23040.6%
 
E21540.6%
 
D12900.4%
 
H10080.3%
 
J8550.2%
 
K6360.2%
 
V6160.2%
 
T5320.1%
 
N3660.1%
 
Y2170.1%
 
F1910.1%
 
O159< 0.1%
 
L154< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
t48516827.0%
 
e33857318.8%
 
a18580910.3%
 
i18416110.3%
 
n1733129.6%
 
d1660699.2%
 
s1609339.0%
 
o227901.3%
 
c173661.0%
 
l114120.6%
 
x100080.6%
 
u91360.5%
 
r89050.5%
 
m50050.3%
 
y40330.2%
 
p39900.2%
 
b33380.2%
 
h26980.2%
 
g25030.1%
 
v11990.1%
 
w199< 0.1%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-164325100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
?671398.3%
 
&1131.7%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
1272100.0%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(159100.0%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
)159100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin215544592.6%
 
Common1727417.4%
 

Most frequent Latin characters

ValueCountFrequency (%) 
t48516822.5%
 
e33857315.7%
 
a1858098.6%
 
i1841618.5%
 
n1733128.0%
 
d1660697.7%
 
S1612407.5%
 
s1609337.5%
 
U1594817.4%
 
o227901.1%
 
c173660.8%
 
l114120.5%
 
M100080.5%
 
x100080.5%
 
u91360.4%
 
r89050.4%
 
P57940.3%
 
m50050.2%
 
C41710.2%
 
y40330.2%
 
p39900.2%
 
R39700.2%
 
I36920.2%
 
b33380.2%
 
h26980.1%
 
Other values (16)143830.7%
 

Most frequent Common characters

ValueCountFrequency (%) 
-16432595.1%
 
?67133.9%
 
12720.7%
 
(1590.1%
 
)1590.1%
 
&1130.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2328186100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
t48516820.8%
 
e33857314.5%
 
a1858098.0%
 
i1841617.9%
 
n1733127.4%
 
d1660697.1%
 
-1643257.1%
 
S1612406.9%
 
s1609336.9%
 
U1594816.9%
 
o227901.0%
 
c173660.7%
 
l114120.5%
 
M100080.4%
 
x100080.4%
 
u91360.4%
 
r89050.4%
 
?67130.3%
 
P57940.2%
 
m50050.2%
 
C41710.2%
 
y40330.2%
 
p39900.2%
 
R39700.2%
 
I36920.2%
 
Other values (22)221221.0%
 
Distinct: 43
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value               Count   Frequency (%)
United-States       160479  80.4%
Mexico              9781    4.9%
?                   6119    3.1%
Puerto-Rico         2473    1.2%
Italy               1844    0.9%
Canada              1451    0.7%
Germany             1382    0.7%
Philippines         1231    0.6%
Poland              1110    0.6%
Cuba                1108    0.6%
El-Salvador         1108    0.6%
Dominican-Republic  1103    0.6%
England             903     0.5%
China               760     0.4%
Columbia            612     0.3%
South Korea         609     0.3%
Ireland             599     0.3%
India               581     0.3%
Vietnam             473     0.2%
Japan               469     0.2%
Jamaica             453     0.2%
Guatemala           444     0.2%
Ecuador             375     0.2%
Peru                355     0.2%
Haiti               353     0.2%
Other values (18)   3348    1.7%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 28
Median length: 13
Mean length: 11.72127023
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 47
Unique unicode categories: 7
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
t48857920.9%
 
e34065814.6%
 
a1870618.0%
 
i1845567.9%
 
n1746587.5%
 
d1676417.2%
 
-1653697.1%
 
S1627517.0%
 
s1623096.9%
 
U1607936.9%
 
o220040.9%
 
c164600.7%
 
l112000.5%
 
M97810.4%
 
x97810.4%
 
r88780.4%
 
u87280.4%
 
?61190.3%
 
P55430.2%
 
m48130.2%
 
C40880.2%
 
p40340.2%
 
y36800.2%
 
R35760.2%
 
I33790.1%
 
Other values (22)222241.0%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter180488877.2%
 
Uppercase Letter36053015.4%
 
Dash Punctuation1653697.1%
 
Other Punctuation62180.3%
 
Space Separator13440.1%
 
Open Punctuation157< 0.1%
 
Close Punctuation157< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
S16275145.1%
 
U16079344.6%
 
M97812.7%
 
P55431.5%
 
C40881.1%
 
R35761.0%
 
I33790.9%
 
E23860.7%
 
G22440.6%
 
D11030.3%
 
H10240.3%
 
J9220.3%
 
K7160.2%
 
V6300.2%
 
T5430.2%
 
N3500.1%
 
F2120.1%
 
Y177< 0.1%
 
O157< 0.1%
 
L155< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
t48857927.1%
 
e34065818.9%
 
a18706110.4%
 
i18455610.2%
 
n1746589.7%
 
d1676419.3%
 
s1623099.0%
 
o220041.2%
 
c164600.9%
 
l112000.6%
 
x97810.5%
 
r88780.5%
 
u87280.5%
 
m48130.3%
 
p40340.2%
 
y36800.2%
 
b30790.2%
 
h27720.2%
 
g24900.1%
 
v12850.1%
 
w222< 0.1%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-165369100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
?611998.4%
 
&991.6%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
1344100.0%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(157100.0%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
)157100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin216541892.6%
 
Common1732457.4%
 

Most frequent Latin characters

ValueCountFrequency (%) 
t48857922.6%
 
e34065815.7%
 
a1870618.6%
 
i1845568.5%
 
n1746588.1%
 
d1676417.7%
 
S1627517.5%
 
s1623097.5%
 
U1607937.4%
 
o220041.0%
 
c164600.8%
 
l112000.5%
 
M97810.5%
 
x97810.5%
 
r88780.4%
 
u87280.4%
 
P55430.3%
 
m48130.2%
 
C40880.2%
 
p40340.2%
 
y36800.2%
 
R35760.2%
 
I33790.2%
 
b30790.1%
 
h27720.1%
 
Other values (16)146160.7%
 

Most frequent Common characters

ValueCountFrequency (%) 
-16536995.5%
 
?61193.5%
 
13440.8%
 
(1570.1%
 
)1570.1%
 
&990.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2338663100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
t48857920.9%
 
e34065814.6%
 
a1870618.0%
 
i1845567.9%
 
n1746587.5%
 
d1676417.2%
 
-1653697.1%
 
S1627517.0%
 
s1623096.9%
 
U1607936.9%
 
o220040.9%
 
c164600.7%
 
l112000.5%
 
M97810.4%
 
x97810.4%
 
r88780.4%
 
u87280.4%
 
?61190.3%
 
P55430.2%
 
m48130.2%
 
C40880.2%
 
p40340.2%
 
y36800.2%
 
R35760.2%
 
I33790.1%
 
Other values (22)222241.0%
 
Distinct: 43
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value               Count   Frequency (%)
United-States       176989  88.7%
Mexico              5767    2.9%
?                   3393    1.7%
Puerto-Rico         1400    0.7%
Germany             851     0.4%
Philippines         845     0.4%
Cuba                837     0.4%
Canada              700     0.4%
Dominican-Republic  690     0.3%
El-Salvador         689     0.3%
China               478     0.2%
South Korea         471     0.2%
England             457     0.2%
Columbia            434     0.2%
Italy               419     0.2%
India               408     0.2%
Vietnam             391     0.2%
Poland              381     0.2%
Guatemala           344     0.2%
Japan               339     0.2%
Jamaica             320     0.2%
Peru                268     0.1%
Ecuador             258     0.1%
Haiti               228     0.1%
Nicaragua           218     0.1%
Other values (18)   1948    1.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 28
Median length: 13
Mean length: 12.27975722
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 47
Unique unicode categories: 7
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
t53473021.8%
 
e36586714.9%
 
a1924817.9%
 
i1921267.8%
 
n1851607.6%
 
d1806227.4%
 
-1799107.3%
 
S1784627.3%
 
s1781727.3%
 
U1772277.2%
 
o129750.5%
 
c98050.4%
 
M57670.2%
 
x57670.2%
 
l56760.2%
 
u56210.2%
 
r52010.2%
 
?33930.1%
 
m32720.1%
 
P30960.1%
 
p27190.1%
 
C25440.1%
 
b21220.1%
 
R20900.1%
 
h19300.1%
 
Other values (22)133590.5%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter188804977.1%
 
Uppercase Letter37739115.4%
 
Dash Punctuation1799107.3%
 
Other Punctuation34590.1%
 
Space Separator1047< 0.1%
 
Open Punctuation119< 0.1%
 
Close Punctuation119< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
S17846247.3%
 
U17722747.0%
 
M57671.5%
 
P30960.8%
 
C25440.7%
 
R20900.6%
 
G14610.4%
 
E14040.4%
 
I12380.3%
 
D6900.2%
 
J6590.2%
 
H5740.2%
 
K5710.2%
 
V5100.1%
 
T4460.1%
 
N2410.1%
 
F121< 0.1%
 
O119< 0.1%
 
L105< 0.1%
 
Y66< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
t53473028.3%
 
e36586719.4%
 
a19248110.2%
 
i19212610.2%
 
n1851609.8%
 
d1806229.6%
 
s1781729.4%
 
o129750.7%
 
c98050.5%
 
x57670.3%
 
l56760.3%
 
u56210.3%
 
r52010.3%
 
m32720.2%
 
p27190.1%
 
b21220.1%
 
h19300.1%
 
y14680.1%
 
g13790.1%
 
v755< 0.1%
 
w201< 0.1%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-179910100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
?339398.1%
 
&661.9%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
1047100.0%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(119100.0%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
)119100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin226544092.5%
 
Common1846547.5%
 

Most frequent Latin characters

ValueCountFrequency (%) 
t53473023.6%
 
e36586716.1%
 
a1924818.5%
 
i1921268.5%
 
n1851608.2%
 
d1806228.0%
 
S1784627.9%
 
s1781727.9%
 
U1772277.8%
 
o129750.6%
 
c98050.4%
 
M57670.3%
 
x57670.3%
 
l56760.3%
 
u56210.2%
 
r52010.2%
 
m32720.1%
 
P30960.1%
 
p27190.1%
 
C25440.1%
 
b21220.1%
 
R20900.1%
 
h19300.1%
 
y14680.1%
 
G14610.1%
 
Other values (16)90790.4%
 

Most frequent Common characters

ValueCountFrequency (%) 
-17991097.4%
 
?33931.8%
 
10470.6%
 
(1190.1%
 
)1190.1%
 
&66< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2450094100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
t53473021.8%
 
e36586714.9%
 
a1924817.9%
 
i1921267.8%
 
n1851607.6%
 
d1806227.4%
 
-1799107.3%
 
S1784627.3%
 
s1781727.3%
 
U1772277.2%
 
o129750.5%
 
c98050.4%
 
M57670.2%
 
x57670.2%
 
l56760.2%
 
u56210.2%
 
r52010.2%
 
?33930.1%
 
m32720.1%
 
P30960.1%
 
p27190.1%
 
C25440.1%
 
b21220.1%
 
R20900.1%
 
h19300.1%
 
Other values (22)133590.5%
 

citizenship
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value                                          Count     Frequency (%)
Native- Born in the United States              176992    88.7%
Foreign born- Not a citizen of U S             13401     6.7%
Foreign born- U S citizen by naturalization    5855      2.9%
Native- Born abroad of American Parent(s)      1756      0.9%
Native- Born in Puerto Rico or U S Outlying    1519      0.8%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 43
Median length: 33
Mean length: 33.50715456
Min length: 33

Overview of Unicode Properties

Unique unicode characters: 33
Unique unicode categories: 6
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
(space) 1034829 15.5%
 
t93739614.0%
 
e75478611.3%
 
n6102799.1%
 
i6100429.1%
 
a3952495.9%
 
o2595053.9%
 
r2329403.5%
 
-1995233.0%
 
U1977673.0%
 
S1977673.0%
 
N1936682.9%
 
v1802672.7%
 
B1802672.7%
 
d1787482.7%
 
s1787482.7%
 
h1769922.6%
 
b268670.4%
 
z251110.4%
 
c225310.3%
 
g207750.3%
 
F192560.3%
 
f151570.2%
 
u88930.1%
 
y73740.1%
 
Other values (8)207110.3%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter465079069.6%
 
Space Separator103482915.5%
 
Uppercase Letter79679411.9%
 
Dash Punctuation1995233.0%
 
Open Punctuation1756< 0.1%
 
Close Punctuation1756< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
U19776724.8%
 
S19776724.8%
 
N19366824.3%
 
B18026722.6%
 
F192562.4%
 
P32750.4%
 
A17560.2%
 
R15190.2%
 
O15190.2%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
t93739620.2%
 
e75478616.2%
 
n61027913.1%
 
i61004213.1%
 
a3952498.5%
 
o2595055.6%
 
r2329405.0%
 
v1802673.9%
 
d1787483.8%
 
s1787483.8%
 
h1769923.8%
 
b268670.6%
 
z251110.5%
 
c225310.5%
 
g207750.4%
 
f151570.3%
 
u88930.2%
 
y73740.2%
 
l73740.2%
 
m1756< 0.1%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-199523100.0%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
(space) 1034829 100.0%
 

Most frequent Open Punctuation characters

ValueCountFrequency (%) 
(1756100.0%
 

Most frequent Close Punctuation characters

ValueCountFrequency (%) 
)1756100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin544758481.5%
 
Common123786418.5%
 

Most frequent Latin characters

ValueCountFrequency (%) 
t93739617.2%
 
e75478613.9%
 
n61027911.2%
 
i61004211.2%
 
a3952497.3%
 
o2595054.8%
 
r2329404.3%
 
U1977673.6%
 
S1977673.6%
 
N1936683.6%
 
v1802673.3%
 
B1802673.3%
 
d1787483.3%
 
s1787483.3%
 
h1769923.2%
 
b268670.5%
 
z251110.5%
 
c225310.4%
 
g207750.4%
 
F192560.4%
 
f151570.3%
 
u88930.2%
 
y73740.1%
 
l73740.1%
 
P32750.1%
 
Other values (4)65500.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
(space) 1034829 83.6%
 
-19952316.1%
 
(17560.1%
 
)17560.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII6685448100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
(space) 1034829 15.5%
 
t93739614.0%
 
e75478611.3%
 
n6102799.1%
 
i6100429.1%
 
a3952495.9%
 
o2595053.9%
 
r2329403.5%
 
-1995233.0%
 
U1977673.0%
 
S1977673.0%
 
N1936682.9%
 
v1802672.7%
 
B1802672.7%
 
d1787482.7%
 
s1787482.7%
 
h1769922.6%
 
b268670.4%
 
z251110.4%
 
c225310.3%
 
g207750.3%
 
F192560.3%
 
f151570.2%
 
u88930.1%
 
y73740.1%
 
Other values (8)207110.3%
 

own_business_or_self_employed
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value    Count     Frequency (%)
0        180672    90.6%
2        16153     8.1%
1        2698      1.4%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
018067290.6%
 
2161538.1%
 
126981.4%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number199523100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
018067290.6%
 
2161538.1%
 
126981.4%
 

Most occurring scripts

ValueCountFrequency (%) 
Common199523100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
018067290.6%
 
2161538.1%
 
126981.4%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII199523100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
018067290.6%
 
2161538.1%
 
126981.4%
 

fill_inc_questionnaire_for_veteran's_admin
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value              Count     Frequency (%)
Not in universe    197539    99.0%
No                 1593      0.8%
Yes                391       0.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 15
Median length: 15
Mean length: 14.87269137
Min length: 2

Overview of Unicode Properties

Unique unicode characters: 12
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e39546913.3%
 
(space) 395078 13.3%
 
i39507813.3%
 
n39507813.3%
 
N1991326.7%
 
o1991326.7%
 
s1979306.7%
 
t1975396.7%
 
u1975396.7%
 
v1975396.7%
 
r1975396.7%
 
Y391< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter237284380.0%
 
Space Separator39507813.3%
 
Uppercase Letter1995236.7%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
N19913299.8%
 
Y3910.2%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e39546916.7%
 
i39507816.6%
 
n39507816.6%
 
o1991328.4%
 
s1979308.3%
 
t1975398.3%
 
u1975398.3%
 
v1975398.3%
 
r1975398.3%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
(space) 395078 100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin257236686.7%
 
Common39507813.3%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e39546915.4%
 
i39507815.4%
 
n39507815.4%
 
N1991327.7%
 
o1991327.7%
 
s1979307.7%
 
t1975397.7%
 
u1975397.7%
 
v1975397.7%
 
r1975397.7%
 
Y391< 0.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
(space) 395078 100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2967444100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e39546913.3%
 
(space) 395078 13.3%
 
i39507813.3%
 
n39507813.3%
 
N1991326.7%
 
o1991326.7%
 
s1979306.7%
 
t1975396.7%
 
u1975396.7%
 
v1975396.7%
 
r1975396.7%
 
Y391< 0.1%
 

veterans_benefits
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value    Count     Frequency (%)
2        150130    75.2%
0        47409     23.8%
1        1984      1.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
215013075.2%
 
04740923.8%
 
119841.0%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number199523100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
215013075.2%
 
04740923.8%
 
119841.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Common199523100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
215013075.2%
 
04740923.8%
 
119841.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII199523100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
215013075.2%
 
04740923.8%
 
119841.0%
 

weeks_worked_in_year
Real number (ℝ≥0)

ZEROS

Distinct: 53
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.17489713
Minimum: 0
Maximum: 52
Zeros: 95983
Zeros (%): 48.1%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 8
Q3: 52
95th percentile: 52
Maximum: 52
Range: 52
Interquartile range (IQR): 52

Descriptive statistics

Standard deviation: 24.41148817
Coefficient of variation (CV): 1.053359073
Kurtosis: -1.863805826
Mean: 23.17489713
Median Absolute Deviation (MAD): 8
Skewness: 0.2101693419
Sum: 4623925
Variance: 595.9207546
Monotonicity: Not monotonic
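
The descriptive statistics above are linked by a few simple identities, so they can be cross-checked by hand. The sketch below plugs in the figures reported for weeks_worked_in_year. The variable names are illustrative, and any comparison against the printed values should allow a small tolerance because the report rounds them.

```python
# Figures copied from the weeks_worked_in_year summary in this report.
n_observations = 199523
mean = 23.17489713
std_dev = 24.41148817
total = 4623925
q1, q3 = 0, 52
zeros = 95983

# Coefficient of variation: standard deviation relative to the mean.
coefficient_of_variation = std_dev / mean

# Variance is the square of the standard deviation.
variance = std_dev ** 2

# The mean can be recovered from the sum and the number of observations.
mean_from_sum = total / n_observations

# Interquartile range: spread of the middle half of the data.
iqr = q3 - q1

# Share of observations that are exactly zero, in percent.
zeros_pct = 100 * zeros / n_observations

print(f"CV            = {coefficient_of_variation:.6f}")
print(f"Variance      = {variance:.4f}")
print(f"Mean from sum = {mean_from_sum:.6f}")
print(f"IQR           = {iqr}")
print(f"Zeros (%)     = {zeros_pct:.1f}")
```

An IQR equal to the full range, together with a median of 8, reflects a distribution concentrated at the two extremes (0 and 52 weeks).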
Histogram with fixed size bins (bins=50)
Value                Count    Frequency (%)
0                    95983    48.1%
52                   70314    35.2%
40                   2790     1.4%
50                   2304     1.2%
26                   2268     1.1%
48                   1806     0.9%
12                   1780     0.9%
30                   1378     0.7%
20                   1330     0.7%
8                    1126     0.6%
36                   1108     0.6%
16                   945      0.5%
32                   883      0.4%
44                   845      0.4%
51                   819      0.4%
24                   767      0.4%
4                    757      0.4%
46                   708      0.4%
35                   704      0.4%
10                   694      0.3%
45                   669      0.3%
6                    646      0.3%
39                   602      0.3%
42                   573      0.3%
28                   568      0.3%
Other values (28)    7156     3.6%
 
Value    Count    Frequency (%)
0        95983    48.1%
1        464      0.2%
2        458      0.2%
3        417      0.2%
4        757      0.4%
5        309      0.2%
6        646      0.3%
7        152      0.1%
8        1126     0.6%
9        239      0.1%
 
Value    Count    Frequency (%)
52       70314    35.2%
51       819      0.4%
50       2304     1.2%
49       509      0.3%
48       1806     0.9%
47       278      0.1%
46       708      0.4%
45       669      0.3%
44       845      0.4%
43       374      0.2%
 

year
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value    Count    Frequency (%)
94       99827    50.0%
95       99696    50.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
919952350.0%
 
49982725.0%
 
59969625.0%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number399046100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
919952350.0%
 
49982725.0%
 
59969625.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Common399046100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
919952350.0%
 
49982725.0%
 
59969625.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII399046100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
919952350.0%
 
49982725.0%
 
59969625.0%
 

income
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value       Count     Frequency (%)
- 50000.    187141    93.8%
50000+.     12382     6.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 8
Median length: 8
Mean length: 7.937941992
Min length: 7

Overview of Unicode Properties

Unique unicode characters: 6
Unique unicode categories: 5
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
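
As a rough illustration of how such per-category totals can be derived, the sketch below tallies Unicode general categories for the two income labels, weighting each label by its count from this report. It uses Python's standard `unicodedata` module. The helper name `category_counts` and the short-code-to-name mapping are illustrative assumptions, not the report generator's own code.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in the income labels.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Zs": "Space Separator",
    "Sm": "Math Symbol",
}

def category_counts(value_counts):
    """Total characters per Unicode general category, weighted by value count."""
    totals = Counter()
    for value, count in value_counts.items():
        for ch in value:
            totals[unicodedata.category(ch)] += count
    return totals

# Value counts taken from the income variable in this report.
income_counts = {"- 50000.": 187141, "50000+.": 12382}
totals = category_counts(income_counts)

for code, n in totals.most_common():
    print(f"{CATEGORY_NAMES.get(code, code):<18} {n}")
```

The totals can be compared line by line with the "Most occurring categories" table for income that follows.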

Most occurring characters

ValueCountFrequency (%) 
079809250.4%
 
519952312.6%
 
.19952312.6%
 
-18714111.8%
 
(space) 187141 11.8%
 
+123820.8%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number99761563.0%
 
Other Punctuation19952312.6%
 
Dash Punctuation18714111.8%
 
Space Separator18714111.8%
 
Math Symbol123820.8%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-187141100.0%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
(space) 187141 100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
079809280.0%
 
519952320.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
.199523100.0%
 

Most frequent Math Symbol characters

ValueCountFrequency (%) 
+12382100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Common1583802100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
079809250.4%
 
519952312.6%
 
.19952312.6%
 
-18714111.8%
 
(space) 187141 11.8%
 
+123820.8%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII1583802100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
079809250.4%
 
519952312.6%
 
.19952312.6%
 
-18714111.8%
 
(space) 187141 11.8%
 
+123820.8%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for an exactly linear relationship the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
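
The calculation described above can be sketched in plain Python. The function name `pearson_r` and the sample data are illustrative. The common 1/n factors of the covariance and the standard deviations cancel, so the sketch works with raw sums.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: covariance of X and Y over the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations (the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)

# Shifting and positively rescaling either variable leaves r unchanged.
r_rescaled = pearson_r([10 * x + 3 for x in xs], [0.5 * y - 7 for y in ys])
print(r, r_rescaled)
```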

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
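
A minimal sketch of that recipe follows: convert each variable to ranks, then apply Pearson's formula to the ranks. The function names are illustrative, and assigning tied values the average of their rank positions is an assumption (a common convention, not something this report states).

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions (assumed convention)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Pearson's r on two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson_r(average_ranks(xs), average_ranks(ys))

# A monotonic but nonlinear relationship (y = x cubed).
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print("rho =", spearman_rho(xs, ys))
print("r   =", pearson_r(xs, ys))
```

On this cubic example the ranks of X and Y coincide, so ρ reaches its maximum while r stays below 1.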

Kendall's τ

Similar to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
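
The pair-counting definition above can be sketched directly. This is the simple form (often called tau-a), which counts tied pairs as neither concordant nor discordant. Tie-adjusted variants exist, and which variant produced the figures in this report is not stated here, so treat the function below as an illustration only.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total pairs, with no adjustment for ties."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Six pairs in total: five concordant, one discordant (the swapped 3 and 2).
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

The double loop is quadratic in the number of observations, which is fine for illustration but slow for a dataset of this size.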

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available in the Phik project documentation.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
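
A minimal sketch of a bias-corrected Cramér's V, following the correction usually attributed to Bergsma (2013): the squared phi coefficient and the table dimensions are each shrunk by a term of order 1/(n - 1) before the ratio is taken. The function name and toy data are illustrative, and this is not necessarily the exact code path used to produce this report.

```python
import math
from collections import Counter

def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equal-length sequences of categories."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)
    if r < 2 or k < 2:
        raise ValueError("each variable needs at least two categories")

    # Pearson chi-squared statistic over the full r x k contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias corrections for phi squared and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        raise ValueError("too few observations for the number of categories")
    return math.sqrt(phi2_corr / denom)

# Toy data: y determined by x (perfect association) versus unrelated to x.
x = ["a", "a", "b", "b"] * 25
y_dependent = ["p", "p", "q", "q"] * 25
y_independent = ["p", "q", "p", "q"] * 25
print(cramers_v_corrected(x, y_dependent))
print(cramers_v_corrected(x, y_independent))
```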

Missing values

Sample

First rows

age, class_of_worker, detailed_industry_recode, detailed_occupation_recode, education, wage_per_hour, enroll_in_edu_inst_last_wk, marital_stat, major_industry_code, major_occupation_code, race, hispanic_origin, sex, member_of_a_labor_union, reason_for_unemployment, full_or_part_time_employment_stat, capital_gains, capital_losses, dividends_from_stocks, tax_filer_stat, region_of_previous_residence, state_of_previous_residence, detailed_household_and_family_stat, detailed_household_summary_in_household, instance_weight, migration_code-change_in_msa, migration_code-change_in_reg, migration_code-move_within_reg, live_in_this_house_1_year_ago, migration_prev_res_in_sunbelt, num_persons_worked_for_employer, family_members_under_18, country_of_birth_father, country_of_birth_mother, country_of_birth_self, citizenship, own_business_or_self_employed, fill_inc_questionnaire_for_veteran's_admin, veterans_benefits, weeks_worked_in_year, year, income
073Not in universe00High school graduate0Not in universeWidowedNot in universe or childrenNot in universeWhiteAll otherFemaleNot in universeNot in universeNot in labor force000NonfilerNot in universeNot in universeOther Rel 18+ ever marr not in subfamilyOther relative of householder1700.09???Not in universe under 1 year old?0Not in universeUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe2095- 50000.
158Self-employed-not incorporated434Some college but no degree0Not in universeDivorcedConstructionPrecision production craft & repairWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces000Head of householdSouthArkansasHouseholderHouseholder1053.55MSA to MSASame countySame countyNoYes1Not in universeUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe25294- 50000.
218Not in universe0010th grade0High schoolNever marriedNot in universe or childrenNot in universeAsian or Pacific IslanderAll otherFemaleNot in universeNot in universeNot in labor force000NonfilerNot in universeNot in universeChild 18+ never marr Not in a subfamilyChild 18 or older991.95???Not in universe under 1 year old?0Not in universeVietnamVietnamVietnamForeign born- Not a citizen of U S0Not in universe2095- 50000.
39Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married1758.14NonmoverNonmoverNonmoverYesNot in universe0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe0094- 50000.
410Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married1069.16NonmoverNonmoverNonmoverYesNot in universe0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe0094- 50000.
548Private4010Some college but no degree1200Not in universeMarried-civilian spouse presentEntertainmentProfessional specialtyAmer Indian Aleut or EskimoAll otherFemaleNoNot in universeFull-time schedules000Joint both under 65Not in universeNot in universeSpouse of householderSpouse of householder162.61???Not in universe under 1 year old?1Not in universePhilippinesUnited-StatesUnited-StatesNative- Born in the United States2Not in universe25295- 50000.
642Private343Bachelors degree(BA AB BS)0Not in universeMarried-civilian spouse presentFinance insurance and real estateExecutive admin and managerialWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces517800Joint both under 65Not in universeNot in universeHouseholderHouseholder1535.86NonmoverNonmoverNonmoverYesNot in universe6Not in universeUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe25294- 50000.
728Private440High school graduate0Not in universeNever marriedConstructionHandlers equip cleaners etcWhiteAll otherFemaleNot in universeJob loser - on layoffUnemployed full-time000SingleNot in universeNot in universeSecondary individualNonrelative of householder898.83???Not in universe under 1 year old?4Not in universeUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe23095- 50000.
847Local government4326Some college but no degree876Not in universeMarried-civilian spouse presentEducationAdm support including clericalWhiteAll otherFemaleNoNot in universeFull-time schedules000Joint both under 65Not in universeNot in universeSpouse of householderSpouse of householder1661.53???Not in universe under 1 year old?5Not in universeUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe25295- 50000.
934Private437Some college but no degree0Not in universeMarried-civilian spouse presentConstructionMachine operators assmblrs & inspctrsWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces000Joint both under 65Not in universeNot in universeHouseholderHouseholder1146.79NonmoverNonmoverNonmoverYesNot in universe6Not in universeUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe25294- 50000.

Last rows

age, class_of_worker, detailed_industry_recode, detailed_occupation_recode, education, wage_per_hour, enroll_in_edu_inst_last_wk, marital_stat, major_industry_code, major_occupation_code, race, hispanic_origin, sex, member_of_a_labor_union, reason_for_unemployment, full_or_part_time_employment_stat, capital_gains, capital_losses, dividends_from_stocks, tax_filer_stat, region_of_previous_residence, state_of_previous_residence, detailed_household_and_family_stat, detailed_household_summary_in_household, instance_weight, migration_code-change_in_msa, migration_code-change_in_reg, migration_code-move_within_reg, live_in_this_house_1_year_ago, migration_prev_res_in_sunbelt, num_persons_worked_for_employer, family_members_under_18, country_of_birth_father, country_of_birth_mother, country_of_birth_self, citizenship, own_business_or_self_employed, fill_inc_questionnaire_for_veteran's_admin, veterans_benefits, weeks_worked_in_year, year, income
19951357Private9379th grade0Not in universeDivorcedManufacturing-durable goodsMachine operators assmblrs & inspctrsWhiteCentral or South AmericanFemaleNot in universeNot in universeFull-time schedules000SingleNot in universeNot in universeHouseholderHouseholder743.66???Not in universe under 1 year old?4Not in universeDominican-RepublicDominican-RepublicDominican-RepublicForeign born- Not a citizen of U S0Not in universe25295- 50000.
19951451Private331910th grade0Not in universeWidowedRetail tradeSalesWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000SingleSouthNorth DakotaHouseholderHouseholder1302.34NonMSA to nonMSASame countySame countyNoYes6Not in universeUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe25294- 50000.
19951587Not in universe00High school graduate0Not in universeWidowedNot in universe or childrenNot in universeWhiteAll otherFemaleNot in universeNot in universeNot in labor force000SingleNot in universeNot in universeNonfamily householderHouseholder3255.80???Not in universe under 1 year old?0Not in universe?United-StatesUnited-StatesNative- Born in the United States0Not in universe2095- 50000.
1995163Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeBlackAll otherMaleNot in universeNot in universeChildren or Armed Forces000NonfilerSouthUtahChild under 18 of RP of unrel subfamilyNonrelative of householder2733.75MSA to MSASame countySame countyNoYes0Mother only presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe0094- 50000.
19951739Private4326Bachelors degree(BA AB BS)0Not in universeNever marriedEducationAdm support including clericalOtherMexican-AmericanMaleNoNot in universeFull-time schedules684900SingleNot in universeNot in universeNonfamily householderHouseholder908.14???Not in universe under 1 year old?6Not in universeMexicoMexicoMexicoForeign born- Not a citizen of U S2Not in universe25295- 50000.
19951887Not in universe007th and 8th grade0Not in universeMarried-civilian spouse presentNot in universe or childrenNot in universeWhiteAll otherMaleNot in universeNot in universeNot in labor force000Joint both 65+Not in universeNot in universeHouseholderHouseholder955.27???Not in universe under 1 year old?0Not in universeCanadaUnited-StatesUnited-StatesNative- Born in the United States0Not in universe2095- 50000.
19951965Self-employed-incorporated37211th grade0Not in universeMarried-civilian spouse presentBusiness and repair servicesExecutive admin and managerialWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces641809Joint one under 65 & one 65+Not in universeNot in universeHouseholderHouseholder687.19NonmoverNonmoverNonmoverYesNot in universe1Not in universeUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe25294- 50000.
19952047Not in universe00Some college but no degree0Not in universeMarried-civilian spouse presentNot in universe or childrenNot in universeWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces00157Joint both under 65Not in universeNot in universeHouseholderHouseholder1923.03???Not in universe under 1 year old?6Not in universePolandPolandGermanyForeign born- U S citizen by naturalization0Not in universe25295- 50000.
19952116Not in universe0010th grade0High schoolNever marriedNot in universe or childrenNot in universeWhiteAll otherFemaleNot in universeNot in universeNot in labor force000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married4664.87???Not in universe under 1 year old?0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe2095- 50000.
19952232Private4230High school graduate0Not in universeNever marriedMedical except hospitalOther serviceBlackAll otherFemaleNoNot in universeChildren or Armed Forces000SingleNot in universeNot in universeNonfamily householderHouseholder1830.11NonmoverNonmoverNonmoverYesNot in universe6Not in universe???Foreign born- Not a citizen of U S0Not in universe25294- 50000.

Duplicate rows

Most frequent

age, class_of_worker, detailed_industry_recode, detailed_occupation_recode, education, wage_per_hour, enroll_in_edu_inst_last_wk, marital_stat, major_industry_code, major_occupation_code, race, hispanic_origin, sex, member_of_a_labor_union, reason_for_unemployment, full_or_part_time_employment_stat, capital_gains, capital_losses, dividends_from_stocks, tax_filer_stat, region_of_previous_residence, state_of_previous_residence, detailed_household_and_family_stat, detailed_household_summary_in_household, instance_weight, migration_code-change_in_msa, migration_code-change_in_reg, migration_code-move_within_reg, live_in_this_house_1_year_ago, migration_prev_res_in_sunbelt, num_persons_worked_for_employer, family_members_under_18, country_of_birth_father, country_of_birth_mother, country_of_birth_self, citizenship, own_business_or_self_employed, fill_inc_questionnaire_for_veteran's_admin, veterans_benefits, weeks_worked_in_year, year, income, count
5593Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married2125.99???Not in universe under 1 year old?0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe0095- 50000.6
194711Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married1131.62NonmoverNonmoverNonmoverYesNot in universe0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe0094- 50000.6
1040Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married1363.88???Not in universe under 1 year old?0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe0095- 50000.5
3582Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married1182.42???Not in universe under 1 year old?0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe0095- 50000.5
5903Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married966.31???Not in universe under 1 year old?0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe0095- 50000.5
6033Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married1220.24???Not in universe under 1 year old?0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe0095- 50000.5
6273Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married1803.03NonmoverNonmoverNonmoverYesNot in universe0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe0094- 50000.5
8815Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married886.02???Not in universe under 1 year old?0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe0095- 50000.5
14338Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married1215.87???Not in universe under 1 year old?0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe0095- 50000.5
14538Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married1979.97???Not in universe under 1 year old?0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe0095- 50000.5
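
A listing like the one above, where each repeated row is paired with the number of times it occurs, can be produced with a simple tally. The sketch below uses only the standard library and toy rows. The helper names are hypothetical, and counting every repeat after the first occurrence as a duplicate is an assumption, since the report does not state how its duplicate total is defined.

```python
from collections import Counter

def most_frequent_duplicates(rows, top=10):
    """Rows that occur more than once, with their counts, most frequent first."""
    counts = Counter(tuple(row) for row in rows)
    duplicated = [(row, c) for row, c in counts.items() if c > 1]
    duplicated.sort(key=lambda item: item[1], reverse=True)
    return duplicated[:top]

def count_duplicate_rows(rows):
    """Number of rows that repeat an earlier row (assumed definition)."""
    counts = Counter(tuple(row) for row in rows)
    return sum(c - 1 for c in counts.values() if c > 1)

# Toy records standing in for full 42-column rows.
rows = [
    ("a", 1), ("a", 1), ("b", 2),
    ("a", 1), ("c", 3), ("b", 2),
]
for row, count in most_frequent_duplicates(rows):
    print(row, count)
print("duplicate rows:", count_duplicate_rows(rows))
```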